AUTOPILOT BUDGETING: WILL CONGRESS 
EVER RESPOND TO GOVERNMENT 
PERFORMANCE DATA? 


TUESDAY, JUNE 13, 2006 

U.S. Senate, 
Subcommittee on Federal Financial Management, 
Government Information, and International Security 
of the Committee on Homeland Security and 
Governmental Affairs, 
Washington, DC. 

The Subcommittee met, pursuant to notice, at 2:57 p.m., in room 
342, Dirksen Senate Office Building, Hon. Tom Coburn (Chairman 
of the Subcommittee) presiding. 

Present: Senators Coburn and Carper. 

OPENING STATEMENT OF SENATOR COBURN 

Senator Coburn. Good afternoon. The Federal Financial Man- 
agement Subcommittee of the Homeland Security and Govern- 
mental Affairs Committee will come to order. Senator Carper will 
be here in a moment. We apologize for the delay. There was an offi- 
cial photo. We also have a conflict. There is a briefing ongoing now 
by the Secretary of Defense and the Secretary of State, which will 
limit Senator Carper’s time with us. So we are going to go on and 
go forward so we have it in the record. I apologize for the con- 
flicting schedules. 

Americans have a crazy idea, that they should get something for 
their money, even when the money is spent by the government. It 
is a simple concept, and in policy-speak we call it performance- 
based budgeting. I know I am new in the Senate, but I am still sur- 
prised by the level of resistance in Washington to holding people 
accountable by measuring their performance. And it is a difficult 
thing to do. A multitrillion-dollar government imposing some sort 
of standardized outcome evaluation is difficult at best, and what it 
implies is that the tool will be very crude. But that does not say 
we should not attempt to make measurements, and I want to be 
one of many who should commend both Mr. Johnson and the Bush 
Administration, and the President himself, for being the first to at- 
tempt to do it. 

It is not novel. It is required in the competitive business environ- 
ment that we find ourselves worldwide. It is being used effectively 
in many State governments, and it is something that is long over- 
due. The Performance Assessment Rating Tool (PART) was first in- 
troduced by the President 4 years ago as a tool to review the 
strengths and weaknesses of government programs to influence 
funding and programmatic decisions. The annual PART reports 
offer needed sunshine in government and provide good data for 
government managers to improve their programs. 

Today, the Office of Management and Budget has reviewed 793 
programs, which account for $1.47 trillion in taxpayer money. Al- 
most a third of these programs have proven to not meet up to 
standards based on the PART analysis. I have already admitted 
that it is a blunt tool. One-third of $1.5 trillion is $500 billion. 
Maybe this is why PART scores so far have created a stir not only 
among the agencies but among the Members of Congress who make 
budgeting decisions. 

Some Members of Congress want to stick their heads in the sand 
and keep funding pet programs on autopilot year after year. To my 
amazement, just last week, the Appropriations Subcommittee that 
funds the Departments of Labor and Health and Human Services 
passed language prohibiting the use of PART assessments on those 
agencies. They may not like PART’s message, but they should not 
shoot the messenger. This sort of Orwellian immunization against 
any hint that our favorite programs may not be performing up to 
the idealized, utopian goals of their Congressional champions is one 
of the reasons why Americans are mad at Congress. 

The approval ratings for Congress are in the tank, and this pro- 
hibition of accountability for failing government is why the voters 
who fork over their hard-earned dollars every year may just have 
something new to say come this November. I am not sure why so 
many of my colleagues are afraid of assessment tools on perform- 
ance. It may reflect their own performance. 

As part of our investigation for this hearing, we learned that low 
PART ratings do not always mean that OMB will recommend a 
budget cut or a cut in the program or a recommendation to go on 
the terminations list. In some cases, programs rated ineffective 
have had budget reductions recommended. But in other cases, the 
reason they were low was because they were not funded appro- 
priately to begin with, and therefore, they could not accomplish 
what they were intended to because they did not have adequate 
funding. 

Each program is unique, and I do not know that a PART score 
should be the last word. But I do know that measurement of per- 
formance is something that every member of a Congressional au- 
thorizing or Appropriations committee should be reading and using 
to inform their oversight work. Congress consistently neglects the 
duty to conduct oversight of Federal programs and spending. In- 
stead, we spend most of the time passing spending bills that ignore 
PART ratings, the President’s termination list, or any other per- 
formance data as if the spending were on autopilot. Congress might 
as well write a blank check. 

By 2008, OMB will have applied PART to the entire government. 
In the last 4 years, OMB has scored 793 government programs. 
Here are the results: 15 percent were found to be effective; 29 per- 
cent were found to be moderately effective; 28 percent were rated 
adequate; 4 percent were found to be ineffective — that is one in 
every 25 programs — 25 percent could not demonstrate results to 
get a rating and were labeled results not demonstrated. 





I do not believe the spin that results not demonstrated can mean 
that the program is either good or bad; we just do not have enough 
information to tell. On the contrary, the results not demonstrated 
designation is a red flag marking a program so poorly conceived by 
us or so directionless that unaccountability seems to have 
been built into it by design. Programs rated ineffective or results 
not demonstrated account for $152 billion in budget authority. 
Imagine what we could do with $152 billion right now. The ideas 
are endless. 

Outside of Washington, DC, any business or family with finite re- 
sources sets priorities and creates a budget based on the actual 
amount of bang they get from their buck. It is only inside the Belt- 
way where that kind of information is not considered relevant, and 
in fact, some are even attempting to ban the collection of such in- 
formation. But then, it is only Washington where you never have 
to declare bankruptcy, and debt is allowed to grow on the backs of 
future generations with impunity. 

Let me give you one case study, and my co-chairman on this will 
disagree, but my firm belief is the following: We held a hearing 
last year on the Advanced Technology Program that was created in 
1988 to subsidize high-risk research and development. This pro- 
gram has never demonstrated results. What it has demonstrated is 
corporate welfare. Its 2002 PART reported that the majority of ATP 
grants go to multibillion dollar corporations and that the GAO has 
found that ATP projects are very similar to private sector R&D un- 
dertaken without a government subsidy. An amendment to elimi- 
nate this funding that was offered last year lost by a vote of 68 to 
29. In the end, Congress wasted a portion of $79 million last year 
for that program. The 2007 Senate budget resolution promises to 
fund the program at almost twice that amount. 

It would be one thing if we were operating in a surplus. Then, 
we could have a legitimate debate about whether to keep failing 
programs, hoping that they would improve, or to give that surplus 
back to the taxpayers. But that is not where we are today. With 
a debt burden of $25,000 per man, woman, and child, we simply 
cannot afford to keep funding programs that cannot prove their 
worth. Non-defense discretionary spending has increased 45 per- 
cent since 2001. The President has requested a $2.8 trillion budget, 
and that does not include any of the so-called “emergency” supple- 
mental bills in our future, nor does it include the late night pork 
barrel frenzy each time Congress schedules an appropriations bill 
vote. 

Entitlement spending will tank our economy if we do not do 
something to get spending under control. The question remains: 
How do we get Congress to act? I would like to see OMB sell their 
PART terminations list more aggressively, forcibly sell the reforms 
and savings to Congress, fight for the cuts by taking the termi- 
nations list to the American people with the power of the bully pul- 
pit. The President should veto spending bills that continue to issue 
blank checks for failing programs. 

There is a bit of hope on the horizon. I was encouraged to see 
that the House Appropriations Committee wrote in their 2006 
budget savings report that the only way to establish accountability 
in the budget process is to stop spending on programs that have 
outlived their usefulness or could be delivered more effectively at 
the State or local level. I will believe that when I see it, but I wel- 
come any help that we can get. 

The best place to start is by immediately defunding all programs 
on the termination list and adopting other PART recommendation 
reductions. Granted, the list only cuts $20 billion from a $2.8 tril- 
lion budget, but we have got to start somewhere. What is more, we 
should suspend the creation of any new program until further no- 
tice or it is compared to the existing programs that it is meant to 
supplement. We need sunset legislation that would phase out gov- 
ernment agencies on a timed basis, where we force ourselves to 
look at them and to reauthorize them. 

These are challenging times, and we can no longer afford to run 
on a budget that is on cruise control. I want to thank our witnesses 
for being here. 

[The prepared statement of Chairman Coburn follows:] 

OPENING PREPARED STATEMENT OF CHAIRMAN COBURN 

Americans have a crazy idea: They should get something for their money, even 
when the money is spent by government. It’s a simple concept — in policy-speak, we 
call it “performance-based budgeting.” I know I’m new in the Senate, but I’m still 
surprised by how much resistance there is in Washington to performance-based 
budgeting. 

Now, to be fair, taking a multi-trillion dollar government and imposing some sort 
of standardized outcome evaluation on it is difficult at best. So I concede that any 
instrument we use will be a blunt instrument. But I want to commend President 
Bush for being the first to try. 

The Performance Assessment Rating Tool (PART) was first introduced by the 
President 4 years ago as a tool to review the strengths and weaknesses of govern- 
ment programs to influence funding and programmatic decisions. 

The annual PART reports offer needed sunshine in government and provide good 
data for government managers to improve their programs. To date, the Office of 
Management and Budget has reviewed 793 programs which account for $1.47 tril- 
lion in taxpayer money. Almost a third of these programs have proven either totally 
ineffective or are not demonstrating results. One-third of $1.5 trillion is $500 billion. 

Maybe this is why the PART scores have created a stir — not only among the agen- 
cies, but among the Members of Congress who make budgeting decisions. Some 
Members of Congress want to stick their head in the sand and keep funding their 
pet programs, as if on autopilot, year after year. 

Just last week the House Appropriations subcommittee that funds the Depart- 
ments of Labor, Education and Health and Human Services passed language prohib- 
iting the use of PART assessments on those agencies. They may not like PART’s 
message, but they shouldn’t shoot the messenger. This sort of Orwellian immuniza- 
tion against any hint that our favorite programs may not be performing up to the 
idealized utopian goals of their Congressional champions is why Americans are mad 
at Congress. The approval ratings for Congress are in the tank, and this prohibition 
of accountability for failing government is why the voters who fork over their hard- 
earned dollars every year may just have something to say come November. 

I’m not sure why some of my colleagues are so afraid of PART. As part of our 
investigation for this hearing, we learned that low PART ratings don’t always mean 
that OMB will recommend a budget cut or put the program on the Terminations 
List. In some cases, programs rated “ineffective” had budget reductions, but in other 
cases their budgets increased. Each program is unique and I don’t know that a 
PART score should be the last word, but I do know that the PART is something 
every member of a Congressional authorizing or Appropriations committee should 
be reading and using to inform their oversight work. 

You see, Congress consistently neglects the duty to conduct oversight of Federal 
programs and spending. Instead, we spend most of the time passing spending bills 
that ignore PART ratings, the President’s terminations list and any other perform- 
ance data. It is as if we’re spending on “auto pilot” — Congress might as well just 
write a blank check. 

By 2008, OMB will have applied PART to the entire government. In the last 4 
years OMB has scored 793 government programs. Here are the results: Just 15 per- 
cent were found to be “effective”; 29 percent were rated “moderately effective”; 28 
percent were rated “adequate”; 4 percent were found to be “ineffective”; and 24 per- 
cent cannot demonstrate results to even get a rating and were labeled “results not 
demonstrated”! Don’t believe the spin that “results not demonstrated” could mean 
that the program is either good or bad, we just don’t have enough information to 
tell. On the contrary — the “results not demonstrated” designation is a red flag mark- 
ing a program so poorly conceived or directionless that unaccountability seems to 
have been built into it by design. 

Programs rated “ineffective” or “results not demonstrated” account for $152 billion 
in budget authority. Imagine what we could do with $152 billion. 

Outside of Washington DC, any business or family with finite resources sets prior- 
ities and creates a budget based on the actual amount of bang they get for their 
hard-earned buck. It is only inside the beltway where that kind of information isn’t 
considered relevant and in fact, some are trying hard to ban the collection of such 
information. But then, it’s only in Washington where you never have to declare 
bankruptcy and debt is allowed to grow on the backs of future generations with im- 
punity. 

Let me give you one case study. We held a hearing last year on the Advanced 
Technology Program. The program was created by Congress in 1988 to subsidize 
high-risk research and development. The program cannot demonstrate results. It is 
corporate welfare. The 2002 PART reported that the majority of ATP grants go to 
multimillion dollar corporations and that the GAO has found that ATP projects are 
very similar to private sector R&D undertaken without a government subsidy. An 
amendment to eliminate funding for ATP that I offered last year was voted down 
in the Senate 68-29. In the end, Congress wasted another $79 million last year for 
the program. The 2007 Senate budget resolution promises to fund the program at 
almost twice that amount. 

It would be one thing if we were operating in a surplus. Then we could have a 
legitimate debate about whether to keep funding failing programs hoping they will 
improve or to give that surplus back to the taxpayers. But that’s not where we are 
today, with a debt burden of $25,000 per man, woman and child in America. We 
simply cannot afford to keep funding programs that cannot prove their worth. 

Nondefense discretionary spending has increased over 45 percent since 2001. The 
President has requested a $2.8 trillion budget and that doesn’t include any so called 
“emergency” supplemental spending bills in our future, nor does it include the late- 
night pork-barrel frenzy each time Congress schedules an Appropriations bill vote. 
Entitlement spending will tank our economy if we don’t do something to get spend- 
ing under control. 

The question remains, how do we get Congress to act? I would like to see OMB 
sell their PART and Terminations List more aggressively: 

• Forcefully sell these reforms and savings to Congress. 

• Fight for these cuts, by taking the terminations list to the American people 
with the power of the bully pulpit. 

• The President should veto spending bills that continue to issue blank checks 
to failing programs. 

There’s a bit of hope on the horizon — I was encouraged to see that the House Ap- 
propriations Committee wrote in their 2006 Budget Savings report that “the only 
way to establish accountability in the budget process is to stop spending on pro- 
grams that have outlived their usefulness or could be delivered more effectively at 
the State or local level.” I’ll believe it when I see it, but I welcome any help we can 
get. 

The best place to start is by immediately defunding all programs on the Termi- 
nations List and adopting the other PART reduction recommendations. Granted, the 
list only cuts $20.4 billion from a $2.8 trillion budget, but we’ve got to start some- 
where. What’s more, we should suspend the creation of any new program until fur- 
ther notice. We need “sunset” legislation that would phase out every single govern- 
ment agency, department or program after a certain deadline if the Congress fails 
to act or if the program consistently performs poorly. These are challenging times 
and we can no longer budget on cruise control. 

I want to thank our witnesses for being here today and for the time they spent 
preparing testimony. 

Again, I apologize for the lateness of our attendance, and Senator 
Carper, you are recognized. 

OPENING STATEMENT OF SENATOR CARPER 

Senator Carper. Thank you. 

Senator Coburn. And I have already explained that you will 
probably have to attend the briefing that is ongoing. 

Senator Carper. Thanks, Mr. Chairman. To our witnesses, wel- 
come. It is good to see each of you. We appreciate you joining us 
and providing your testimony today. As the Chairman mentioned, 
Secretary of State Rice and Secretary Rumsfeld are briefing us as 
we speak over in the Capitol, and I want to slip out in a little bit 
and hear what they have to say and hopefully rejoin you before you 
leave. 

Mr. Chairman, thank you for holding this hearing. It is an im- 
portant hearing, as we both know. And as we have discussed in 
any number of our similar hearings in the past over the last couple 
of years, our country is facing a large budget deficit for as far as 
the eye can see, and we are just about to embark on another appro- 
priations season here in Congress, where we will be called on to 
make some difficult decisions about what to do with relatively 
scarce Federal resources. 

At the same time, as GAO and other observers have pointed out 
again and again, we are at a crossroads in our history, where we 
need to decide what we want our government to do in the 21st 
Century. Nearly 5 years after the attacks of September 11, 2001, 
we have a whole new set of needs, a whole new set of priorities 
that must be balanced against some of our older needs and prior- 
ities in scores of popular programs. And with the challenge of retir- 
ing baby boomers, guys like me, our generation on the horizon, we 
just cannot afford to do all of the things that we might want to do. 

That is why initiatives like OMB’s Program Assessment and Rat- 
ing Tool (PART) are interesting and, I think, important. We should 
never be afraid of taking a hard look at Federal programs, my pro- 
grams, Senator Coburn’s programs, whatever, to determine wheth- 
er or not they are accomplishing what was intended for them to ac- 
complish when we first created them. And in this day and age, we 
simply cannot afford to allow either poorly conceived or poorly 
managed programs to continue without reform or, frankly, for a 
program that has run its course and achieved its goals, to continue 
draining resources from other, newer priorities. 

That said, we need to be certain that PART or whatever mecha- 
nism we use to make these evaluations is in itself effective. I think 
to be effective, a program like PART must be totally separated 
from politics and ideology, at least to the extent we can make that 
happen. It must be closely coordinated with existing mechanisms 
agencies and Congress use to align the budget with program goals 
and outcomes, such as the older Government Performance and Re- 
sults Act. And perhaps just as importantly, we also need to make 
sure that a program’s intended beneficiaries outside of Washington 
have a say before an evaluation is actually completed. 

Let me just add in closing, if I could, Mr. Chairman, that we are 
not going to close the budget deficit, we know, by reducing spend- 
ing on a program here or eliminating a program there, although 
every little bit helps. But even if a program were to eliminate every 
single one of the programs receiving failing grades through PART, 
I still think the savings would cover just a fraction of our budget 
deficit, but they would cover a portion of our budget deficit. 

Non-defense discretionary spending, which is the target of many 
of the spending reductions and program eliminations in the Presi- 
dent’s budget proposals, make up a relatively small percentage of 
the Federal budget. I am sure we can find ways to improve the 
management of some of the funding in that 16 percent or even to 
find and eliminate waste and inefficient use of resources within 
that 16 percent. 

If we truly want to tackle the fiscal problems facing us right now, 
however, we, and that is the Congress and I think the Administra- 
tion needs to take a look at the entire budgetary picture. We need 
to look on both the spending and on the revenue side, and we need 
to make some tough choices. 

Thank you, Mr. Chairman. Again, we look forward to your testi- 
mony today. 

Senator Coburn. Thank you, Senator Carper. 

I am going to ask the witnesses to limit their verbal testimony 
to 5 minutes. Your complete written statements will be made a 
part of the official hearing record, and we will hold our questions 
until you have given your testimony. 

Let me first introduce Clay Johnson III, Deputy Director for 
Management at OMB, and in his capacity, he has provided the gov- 
ernment-wide leadership to the Executive Branch agencies to im- 
prove agency and program performance. Formerly, he served as As- 
sistant to the President for Presidential Personnel, responsible for 
the organization that identifies and recruits approximately 4,000 
senior officials, middle management personnel, and part-time board 
and commission members. At OMB, he oversees PART process. 

Eileen Norcross, Senior Research Fellow, Government Account- 
ability Project, The Mercatus Center at George Mason University; 
she joined that center as a research fellow in January 2003. Her 
research areas include the U.S. budget, the use of performance 
budgeting in the Federal Government, tax and fiscal policy, and en- 
vironmental regulation. She is one of the leading experts on per- 
formance-based budgeting, and her scholarship plays a vital role in 
the debate on PART and the importance of measuring outcomes. 

Adam Hughes is the Director for Federal Fiscal Policy at OMB 
Watch. He oversees Federal budget and tax policy, income and 
wealth trends, and government performance issues at OMB Watch. 
Senator Carper and myself very much appreciate the work that 
OMB Watch has done in their pursuit of transparent and account- 
able government and for the support of the Federal Funding Ac- 
countability and Transparency Act that we both authored. This bill 
would create an online public database that itemizes Federal fund- 
ing so taxpayers can see how their money is being spent. 

I want to welcome you all. I will recognize Mr. Johnson first. 

TESTIMONY OF CLAY JOHNSON III,¹ DEPUTY DIRECTOR FOR 
MANAGEMENT, U.S. OFFICE OF MANAGEMENT AND BUDGET 

Mr. Johnson. Mr. Chairman, Senator Carper, thank you very 
much. 

The title of this hearing is Will Congress Ever Respond to Pro- 
gram Performance Data? In preparing my response, I rephrased 
that to Does Congress Care Whether Programs Work or Not? My 
answer is “I am not sure, but I sure hope so.” I believe that tax- 
payers want Congress to ensure that they, the taxpayers, get what 
they pay for. I believe that we all, to widely varying degrees, how- 
ever, want Federal programs to do what they are supposed to do 
and get better every year. 

I believe that money is tight, as you all have pointed out, and 
the biggest opportunity we have to add new services and expand 
existing services to more citizens is through causing our existing 
programs to work better, not spend more money. I believe that ca- 
reer employees want to be held accountable for how their programs 
perform. They tell me this in focus groups. And I also believe that 
career employees care about how their programs perform. 

Because of this, I believe it is important to have certain things. 
I believe it is important to have clear outcome goals for each Fed- 
eral program. We do not have that now. I believe it is important 
to have Federal program performance information that is objective, 
as objective and reliable as possible. I believe that we need to have 
lots of transparency about how well programs are performing. If we 
do all of this in the dead of night, it cannot be used to hold people 
accountable. 

I believe that we need lots of debate about these performance as- 
sessments and how to make them better. As you said, Mr. Chair- 
man, program assessment is going to be a blunt instrument, par- 
ticularly in the early years. And it will only get better every year, 
but a blunt instrument is better than no instrument at all. I also 
believe it is important to have lots of discussion about how to help 
programs work better. We talk a lot about using the PART to make 
budget decisions. I believe the primary use of PART information is 
to help programs get better. If we cut programs, we might save $10 
billion here or $15 billion there per year. If we cause 1 percent im- 
provement in program performance each year, that is $28 billion a 
year. Two percent is obviously twice that. 

After 5 years of effort, not 5 months, comprehensive program 
performance information is still time consuming and very hard to 
come by. We have program outcome goals, performance informa- 
tion, and lots of transparency, which other countries and several 
States are working to adopt, and most good government groups ap- 
plaud. What we do not have from most Members of Congress is a 
lot of constructive debate about these assessments and how to im- 
prove and use them to improve program performance. We have 
asked for feedback. We have asked for engagement by Congress but 
have not gotten it. 

Currently, a majority of Appropriations subcommittees have no 
objection to the way agencies use performance information to jus- 


1 The prepared statement of Mr. Johnson with attachments appears in the Appendix on page 
25. 
tify their budgets. Some of these subcommittees actually use the 
PART to justify program funding in their bills. A few Members of 
Congress have advanced greater use of performance information in 
decisionmaking. Congressmen Platts and Tanner have proposed 
separate pieces of legislation, while Senators like you, Senator 
Coburn, and Senators Carper, Ensign, and Allard have spoken out 
on the subject, and Congressmen Cuellar, Conaway, and Diaz- 
Balart have spoken out on it as well. 

But these expressions of interest in program performance are the 
exceptions. There is a big, a huge opportunity for Congress to chal- 
lenge programs to clearly define success and their plan for achiev- 
ing it, and then to hold agencies accountable for doing what they 
said they were going to do. 

That concludes my remarks, and I look forward to any questions. 

Senator Coburn. Ms. Norcross. 

TESTIMONY OF EILEEN NORCROSS,¹ SENIOR RESEARCH FEL- 
LOW FOR THE GOVERNMENT ACCOUNTABILITY PROJECT, 
THE MERCATUS CENTER AT GEORGE MASON UNIVERSITY 

Ms. Norcross. Thank you, Chairman Coburn, Senator Carper, 
for inviting me to testify today on Autopilot Budgeting: Will Con- 
gress Ever Respond to Government Performance Data? Our work 
in the Government Accountability Project at the Mercatus Center 
at George Mason University focuses closely on performance infor- 
mation in government, and I note that the views expressed in my 
testimony are not an official position of the university. 

I would like to submit for the record our paper on the results of 
the fiscal year 2007 PART for your reference. 

Senator Coburn. Without objection, the document will be in- 
cluded in the record.² 

A program is a tool to achieve a policy goal. Do economic develop- 
ment programs lead to prosperous communities? Are homeland se- 
curity programs protecting the Nation? Congress needs to know the 
answers to these questions in order to make decisions about how 
to spend resources. Without performance information, Congress 
cannot reliably accomplish its policy aims. Not knowing its con- 
sequences, Congress has created anywhere from 180 to 342 pro- 
grams dealing with economic development in over 24 agencies; 44 
job training programs in nine agencies. 

Program duplication on this scale tells us that Congress is not 
sure which programs are reaching their goals. It has no way of 
comparing programs around common outcomes. Not knowing if a 
job training program is employing people means not spending 
money on programs that are employing people. Not evaluating pro- 
grams on a regular basis prevents the program from effectively 
reaching grantees or delivering results; performance information 
from its dialogue between agencies, the Executive Branch, and 
Congress around jointly defined objectives. 

Congress took the initiative in 1993, when it passed GPRA. 
GPRA has encouraged the development of performance measures 
and data, but it was not until OMB’s Program Assessment Rating 


1 The prepared statement of Ms. Norcross appears in the Appendix on page 49. 

2 The Working Paper in Government Accountability appears in the Appendix on page 61. 
Tool that real progress towards developing measures was made. 
That is because the Administration does not just require the infor- 
mation; it uses it. Congress has identified the need for performance 
information. It must now commit to using it. Otherwise, measuring 
and gathering data is a paper exercise. 

For the past 2 years, the President has issued a major savings 
and reforms report detailing his reasons for terminating or reduc- 
ing funding for programs. Of the 154 recommended for termination 
or reduction in funds last year, 54 were PARTed. The document in- 
dicates where PART played a role. Other factors include lack of a 
Federal role, obsolescence, or completion of mission. 

The Administration uses PART along with other information and 
does not limit itself to the evaluations. It does not automatically re- 
ward satisfactory programs or cancel underperforming ones. By 
contrast, the House Committee on Appropriations report “On Time 
and Under Budget” lists 53 programs that were terminated. It only 
offers explanations for three of the terminations. We do not know 
if the remainder were terminated because they were underper- 
formers or politically easy choices. 

The Administration’s report gives a rationale for each rec- 
ommendation. The House report only provides a list. Ultimately, 
the goal is not to randomly kill programs. Making judgments about 
how to fund agency activity should be constructive, not destructive. 
Performance information helps make policy effective. We want to 
know what works, what does not, and why. 

The only way to give budgetary decisions credibility is to base 
them on a reliable evaluation of their performance. Is PART that 
system? PART’s methodology has been criticized. Improvements 
can and should be made. But what is important about PART is not 
the ratings; it is the Management 101 questions PART asks of 
agency activity. Is the program purpose clear? Is it effectively tar- 
geted? Has it demonstrated progress towards its goals? These ques- 
tions are the substance of PART. These are the questions Congress 
should be asking before allocating resources. 

PART has a few virtues. It has identified and catalogued agency 
activity. It is transparent. It holds programs accountable to the 
same standards. It measures outcomes. One strength often cited 
as a weakness: PART rates programs on statutory limitations. 
Though a source of frustration for agencies, here, PART provides 
a service by identifying those aspects of a program that are bar- 
riers to success. The hope is that Congress review the statute to 
see if it is preventing the program from meeting its objectives. 

Some limitations of PART: It rates programs against their own 
performance. We would like to see PART advanced to compare like 
activities. In some cases, scores may not fully reflect program per- 
formance, and there is a potential for different budget examiners 
to reach different conclusions. We do not believe Congress should 
adopt PART wholesale. We hope Congress would consider using the 
kinds of questions PART is asking as the basis of developing its 
own method of evaluating agency activity based on common out- 
comes. 

Indiscriminate cancellation of programs discredits the budgetary 
process. We leave program managers confused about why their pro- 
grams failed. Programs need to deliver according to clear expecta- 
tions and be given a chance to perform. When you do not meet the 
expectations, reduction in funding or termination should be the re- 
sult. It should not be a surprise. 

We believe performance information is best used in conjunction 
with other criteria. All of these form the basis against which Con- 
gress should continually scrutinize agency activity. Efforts to ad- 
vance what PART has set in motion can only aid Congress in its 
work and give the American people confidence that our Nation’s 
problems are being solved. Thank you. 

Senator Coburn. Mr. Hughes. 

TESTIMONY OF ADAM HUGHES,1 DIRECTOR OF FEDERAL 
FISCAL POLICY, OMB WATCH 

Mr. Hughes. Chairman Coburn, thank you for having me here 
today and for holding this hearing. As you mentioned, I am the 
Federal Fiscal Policy Director at OMB Watch. OMB Watch was 
founded in the 1980s and has spent over 20 years advocating for 
government accountability, transparency, and access to government 
information, and citizen participation in governmental processes. 

OMB Watch believes citizens must take an active role in holding 
their government accountable and that the Federal Government, 
when supported by sensible fiscal policy, can develop effective pro- 
grams and safeguards that meet the public’s needs. 

The issue of government performance, as you mentioned earlier, 
has taken on added importance during the Bush Administration, as 
a combination of factors, some avoidable and some not, have 
plunged the Federal Government into debt. Large and sustained 
deficits over the past 5 years have made efficient use of govern- 
ment resources all the more important. 

In light of the anticipated budget crunch due to the baby 
boomers’ retirement over the coming decades, the fiscal situation in 
this country will only deteriorate further. Performance measure- 
ment can therefore become a particularly attractive alternative for 
those who want to set Federal priorities based on the current fiscal 
prospects of a strained and shrinking revenue base. 

OMB Watch has been commenting on government performance 
issues for the better part of its existence. We have spent increased 
time and resources analyzing the Government Performance and Re- 
sults Act and the Program Assessment Rating Tool over the last 10 
years, as government itself has spent more time focusing on per- 
formance and results. We are strongly supportive of improving the 
Federal Government’s capacity to meet the public’s needs. OMB 
Watch has worked to protect and improve that capacity, and we 
have been open to the possibility of using performance measure- 
ment as one means for achieving those ends. 

We bring a strong belief in the importance and potential of gov- 
ernment itself to the work we do, and because of that belief, we, 
perhaps maybe more than anyone else, want government to be re- 
sponsible to community needs, spend money wisely, and accomplish 
its goals. We are advocates for government and therefore have a 
strong motivation to see government programs succeed. 


1 The prepared statement of Mr. Hughes with attachments appears in the Appendix on page 89. 
PART, however, is a very poor mechanism for measuring pro- 
gram performance and results, introducing biases and skewed ideo- 
logical perspective into a model claiming to present consistent and 
objective performance data and evaluations of government pro- 
grams. Oftentimes, the PART actually decreases the efficiency and 
effectiveness of government through increased administrative bur- 
dens, distracted managers, and compliance costs. 

Ironically, we feel the PART mechanism itself does not produce 
the right type of results to support and improve government. We 
believe PART ratings should not be directly connected to the budg- 
eting process of Congress because of significant deficiencies within 
the mechanisms, namely, the substantial biases and limitations 
embedded within the tool and the additional distortion and manip- 
ulation we have observed in OMB’s actual application of the PART. 

Based on our studies of the PART and our longstanding commit- 
ment to open, accountable government that is responsive to the 
public’s needs, I would like to make three points today. First, we 
feel the PART continues a troubling trend we have seen in other 
recent Executive Branch proposals and even some Congressional 
proposals, namely, a trend towards increasing the power of the 
White House and the Executive Branch even into some areas that 
have been constitutionally designed to be committed to Congress. 

Second, the PART is a limited and distorted tool that should not 
be used for either management of programs or for budget and ap- 
propriations decisions. In both the design of the tool and the proc- 
ess by which the tool is implemented, PART systematically ignores 
the reality and the complexity of Federal programs and judges 
them based on standards that are often deeply incompatible with 
the purposes those very programs are expected to serve. As one 
agency contact memorably explained to us, PART assessments are 
akin to a baseball coach walking to the mound to remove his star 
player and then chastising him for not kicking enough field goals. 

My third point is that there is a better way. Specifically, Con- 
gress already has the means to investigate and produce far more 
sophisticated analyses of the usefulness, effectiveness, and results 
of government programs in a deliberative way, including the oppor- 
tunity for input from a wide array of stakeholder interests. The 
openness of the Legislative Branch allows the Congress to be in- 
formed and make better decisions, but it also serves to balance 
competing agendas and perspectives from both inside and outside 
Congress. 

The oversight and evaluation process is one of the primary if not 
the primary role for the Legislative Branch. While the oversight 
function of Congress may not be as robust as it once was because 
of significantly shorter legislative sessions and delays due to sharp- 
ly divided political climates, the capacity to judge the results of 
government programs already exists within the existing structures, 
structures that we feel do not carry the significant limitations, 
biases, and negative consequences of the PART. 

In conclusion, we all agree that everyone in government, the 
President, agencies and departments, and their staffs, and espe- 
cially Congress, needs to be focused on achieving results in a fair, 
effective, and balanced way. However, this job should most of all 
fall on Congress, which already has the necessary tools and re- 
sources in place to do the most robust and equitable review of the 
entire Federal Government. 

Relying too heavily on PART ratings will not only gradually re- 
move Congress from its funding and oversight responsibilities but 
will also continue to close the door on opportunities for outside 
stakeholder interests, the views of the public, to be infused into the 
Congressional budgeting and evaluation process. The limited per- 
spective of the PART is one of its most glaring deficiencies. While 
subjectivity and bias will almost always creep into any rating sys- 
tem, the PART does not have a mechanism for balancing out the 
results of its one-size-fits-all, Executive Branch-focused perspective. 

While the expansion of the Executive Branch powers has been 
present in our government since the turn of the last century, the 
overreach of those powers into areas historically and constitu- 
tionally given to Congress, the structuring of programs, appro- 
priating and authorizing of revenues, and oversight of government 
is a disturbing trend. Because of this, PART should not be taken 
with just a grain of salt or even a hefty dose of skepticism. Unless 
the tool design and implementation systems are significantly modi- 
fied, the PART ratings should probably be largely ignored by Con- 
gress. 

Thank you. I look forward to your questions. 

Senator Coburn. Thank you. 

I wonder if either of you might want to comment on Mr. Hughes’ 
testimony. It is certainly different than what we heard from either 
Mr. Johnson or Ms. Norcross, and I have several questions for Mr. 
Hughes as well, but I thought — Mr. Johnson, would you like to 
comment? 

Mr. Johnson. Yes; several other countries around the world 
think the PART is great; other States in America think the PART 
is great. Most good government groups think it is great. It is an 
instrument. It is blunt; it will get better every year. 

Most people that observe Congress, that have been around Con- 
gress a long time, believe that the Executive Branch is more inter- 
ested in how well programs work than Congress is. David Walker 
has said that in hearings; so have Dick Armey and others. I would 
bet you agree. It is very hard to produce performance information 
and program assessments. What the Administration has done with 
PART is a place to start. We have been working 5 years on this. 
I do not believe Congress is going to invest 5 years to put together 
the information that we have right now. The PART information is 
a starting point for building better mechanisms for holding agencies 
and programs accountable for what they do. 

So I believe, in spite of its flaws, that PART is an excellent tool. 
It is a wonderful beginning. It is the product of 5 years of effort. 
I do not see this as a power grab by the Executive Branch. The 
subject of this hearing is why won’t Congress pay attention to 
PART, so I don’t think Congress is actually reeling with this on- 
slaught of performance information from the Executive Branch. 
Our challenge is to get them to pay attention to it. 

Senator Coburn. Thank you. Ms. Norcross. 

Ms. Norcross. What is the alternative to not using performance 
information? PART has given us — at least we have moved the dis- 
cussion away from the policy preferences of an administration to- 
wards evaluating programmatic activity. So I do not know what the 
option would be. Should we revert back to a system where we sim- 
ply do not use performance information, expect it, gather it, or ana- 
lyze it? And if there is discomfort with OMB performing these as- 
sessments, perhaps Congress should undertake that. 

I understand Congress only engages about 7 percent of its time 
in oversight. So the current legislative mechanisms that are sup- 
posed to be engaged in this activity are not working up to speed. 
So that would simply be my response is if Congress is supposed to 
be evaluating these programs, where is the evidence that it is, in 
fact, evaluating them and providing guidance to agencies along the 
lines of performance? 

Senator Coburn. All right; thank you. 

Mr. Hughes, you mentioned that there is significant bias and dis- 
tortion and manipulation. Would you give me examples of bias, 
please? 

Mr. Hughes. Sure. There are a number of different types of bi- 
ases that can be involved in this. One is the perspective of the 
OMB officer. The budget officer at OMB is the person who has the 
final say on what the language will be for the answers to the ques- 
tions, how that language that is written will translate into a yes 
or a no or a few of the modified answers that are possible now 
under the PART and also how those yeses and noes get translated 
not only into the numeric raw score but also into the actual rating. 

There are a lot of inconsistencies between the guidelines that 
have been laid down for what raw score equals what rating and 
what the programs that have been reviewed actually get. That is 
one type of bias, and that is from a kind of implementation per- 
spective. There are other biases in the actual design of the tool. I 
think that the format under which it was designed, which was de- 
signed to be accessible to people who may not be policy experts or 
who want to just know, like you say, come and look and see wheth- 
er the government is getting results and whether the program is 
working, that necessitates that certain things are left out. 

One of those things is whether the Congress has designed a pro- 
gram to have multiple goals. Many programs in the Federal Gov- 
ernment are designed to have multiple goals. That sort of thing is 
not taken into consideration within the PART. Oftentimes, those 
goals can be conflicting. That does not necessarily mean that it is 
a bad design. That just means that it is a complex program. And 
that kind of complexity is lost in the way the tool was designed to 
apply to people who may not be policy experts. 

Senator Coburn. Are you saying that there could be another 
PART program that would take into account for that? What is 
wrong by demanding a clear program mission from agencies? 

Mr. Hughes. Certainly nothing. 

Senator Coburn. And questioning how a program fulfills that 
goal; is there anything wrong with those two things? 

Mr. Hughes. No. 

Senator Coburn. So you do not disagree that a PART program 
might be designed better to take out more bias, but you do not dis- 
agree with the fact that knowing what a program’s goal is and 
measuring performance against that goal, it should be an effective 
tool. You would not disagree with that? 
Mr. Hughes. No, theoretically, I agree with you. 

Senator Coburn. The one problem I had with your testimony is 
the problem I have with the rest of Congress is we are lazy. And 
the fact is that this is the 37th oversight hearing of this Sub- 
committee. Go find another one that has done that. And the point 
is that ideally, Congress does have the responsibility, but they do 
not live up to it. And so, what we are working with is in a vacuum, 
is Congress ideally should be doing this. I do not disagree with you, 
but they are not. 

And to have a blunt tool that is getting better, even though it 
can be criticized, and I think Mr. Johnson would agree that it is 
subject to some criticism, as is any assessment tool when you first 
start using it. But to say we should not have them doing it because 
it is Congress’ role — I agree; that is why I am doing it; that is why 
we have done 37 of them, begs the question of how do we motivate 
Congress to do oversight? 

So if we are critical of this one, answer me the question how I 
motivate my peers to do the appropriate thing when it comes to au- 
thorizing a program, and in that authorizing, saying we are going 
to measure it and then having the incentive to have Congress do 
the oversight to see whether or not they have a goal, and they are 
meeting that goal. 

Mr. Hughes. That is, of course, a very difficult question, one that 
I will probably be very insufficient in answering, giving a satisfac- 
tory answer for. I think that the oversight role of Congress, and 
you are correct, of course, in citing the fact that Congress does not 
really do oversight any more. That is indicative of larger things 
about our political system, about the way that the electoral process 
works, about the importance of fundraising. There are multiple 
things that are in there that actually have nothing to do with 
whether Congress should do oversight or not that are enormous 
problems that would be difficult to tackle. 

I think your leadership on this issue is important. I think we 
need to have more folks in Congress who are paying attention to 
these sorts of issues. I do not know if there is a magic bullet proce- 
dural change or a statute or something that we could do that would 
make it so that Congress would be forced to do oversight more. I 
do think that some of the suggestions that have been made in front 
of this Subcommittee in the past about taking the Program Assess- 
ment Rating Tool or a modified version of it outside of the Office 
of Management and Budget, perhaps maybe having the Govern- 
ment Accountability Office do it or establishing a committee within 
Congress that would provide oversight in that regard. I think those 
ideas are worth exploring. 

I do not think that you can just remove the PART the way it ex- 
ists now and give it to GAO and have it work well. I think there 
are design flaws that need to be corrected, that need to be ad- 
justed, and I am, of course, sympathetic to the point that if you 
change it too much, the previous reviews would not be as useful. 
But it is not necessarily just a problem with the way that the tool 
gets done at OMB. We think there are deficiencies within the way 
it was designed as well. 
Senator Coburn. Well, you would be agreeable, then, to submit 
to this Subcommittee the things that you think are deficient in the 
design so that we can look at that? 

Mr. Hughes. Sure, and that was reflected in my written testi- 
mony. There is a section on that. 

Senator Coburn. One of the problems with oversight is that a lot 
of agencies do not respond to our questions. Let us say we had 
oversight, and they do not respond. The only way you can solve 
that is either have somebody who can squeeze them on their 
money, or we have to squeeze them until they respond. But that 
requires the sausage-making process to be able to accomplish that. 

The thing that is disconcerting is I have little faith that Congress 
is going to step up to the bar until they are absolutely forced to 
through a financial disaster to make the hard choices. Congress 
wants to avoid hard choices, and as long as they do not feel the 
pinch, they will not make the hard choices. And that is why 2016 
is going to be a very tough year for this country, because that is 
when the pinch starts, the big pinch. And so, having an assessment 
tool, blunt, maybe somewhat biased, maybe somewhat distorted is, 
in my mind, better than nothing at all. 

Mr. Johnson, and you may not care to comment on this, but you 
might comment on the motivation behind it: The House Sub- 
committee on Labor/HHS put a prohibition in their bill this past 
week that precludes any money from being spent on the PART as- 
sessment. Any comments on the motivation behind that or what 
you see? I am not trying to create a problem for you with the Sub- 
committee, but how did we get there? 

Mr. Johnson. Well, there is one unelected staff member who is 
opposed to the PART. He worked on the Treasury/Transportation 
bill last year and put a similar prohibition in there. He was at 
HUD before that, and he disagreed with HUD’s use of the PART, 
and he was at OMB before that. One unelected staff member is re- 
sponsible for the provision. The chairman of the committee had no 
knowledge that it was in the bill. It is inexplicable to me that lan- 
guage like that is in the bill. That is my only comment. 

Senator Coburn. OK; one of the other things, Mr. Hughes, with 
your testimony which I find, well, less than congruent is the state- 
ment that the PART increases the White House’s power. And the 
problem with that is Congress ignores the PART assessment. We 
have been able to do nothing with the PART assessment. Even 
when I look at all of it, and I look at the agencies, and I have done 
the oversight, and I try to get somebody to do something about it. 
Congress ignores it. 

So there is not a power grab there, because Congress is not pay- 
ing any attention to it. So explain to me your reasoning behind — 
is it a potential? Because it is certainly not, in fact, acted out. 
There is no effect of the PART right now on the Congress, because 
they ignore it. 

Mr. Hughes. I actually would agree with you, and I would say 
that would probably be a poor choice of words on my part. I do 
think it is a potential problem. Let us do a for instance. Let us sup- 
pose that Congress will appropriate funds according to whatever 
the rating on the PART is. Why do we even need Congress? Let’s 
just let OMB do it. So I think it is a slippery slope. I think that 
particularly with respect to budgeting, we have been working more 
in trying to explore the management side of it as well, of PART, 
and the usefulness within agencies. 

I think there is more potential for a productive use of the infor- 
mation there. I do not think that you can look at a PART score and 
say, OK, well, I know how to fund programs now because of these 
problems. So the way I chose my words is probably poor. I do not 
think it is a problem right now; as you say, you are correct, that 
Congress does not pay attention to them. 

Senator Coburn. Well, but let me create a scenario for you. Let 
us say that Congress is doing great oversight on everything. We 
are sunsetting things; we are reauthorizing them; we are bringing 
them back up; we really know what we are doing and that we are 
doing a good job of that. Let’s make that assumption. That is an 
absolute lie, but let’s make that assumption. 

Would you deny the fact that the Administration should have a 
performance tool themselves to measure what the goal is of the 
program and whether or not they are meeting that goal as a man- 
agement tool to become more effective in carrying out the will of 
the Congress? 

Mr. Hughes. No; I think the problem exists when the tool that 
the Administration designs, or it does not even have to be this one, 
the Executive Branch designs portrays itself as an unbiased, objec- 
tive evaluation of how programs and management are going at 
agencies when, in fact, it is anything but that. So I do not think 
that — again, in theory, that this is necessarily a problem. But with 
this particular instance, it is kind of like a wolf in sheep’s clothing. 
You have a situation where they are saying we are doing this; it 
is systematic; it is transparent; it is on the Web; the public can 
view it; this is an innate good. 

But the kind of things that we worry about are the things that 
are not transparent within the PART, that you do not necessarily 
see up front when you look at the one-page review. That is where 
you get into a tricky situation, and it is perfectly fine for the Exec- 
utive Branch to have their own systems and whatever they like, 
but the problem occurs when they try to sell that to Congress as 
the one objective evaluator. 

Senator Coburn. But they have not. They have just said since 
you are not doing one, we are going to do one, and here is what 
we have found, and here is what our recommendation is. We still 
control the purse strings, and it is obvious from the PART assess- 
ment that Congress has totally ignored the Administration when it 
comes to evaluating programs. So that is not seen as a risk to me 
whatsoever. 

Mr. Hughes. Well, that is encouraging to hear. 

Senator Coburn. Well, they have not. 

Mr. Hughes. Well, I would say that they have not succeeded. 

Senator Coburn. I think it is very discouraging to hear, because 
they are not looking at the other as well. 

Mr. Hughes. Fair enough. 

Senator Coburn. They are paying attention to nothing and con- 
tinue it. One of the battles I have, and I will share it in the Sub- 
committee, is there are a lot of bills that I block; they are author- 
izing bills. And I go to Members of the Senate, and I say these are 
the things that I have problems with. And they say, well, why do 
you have problems? And I say, well, you have not looked at the pro- 
grams that are already there before you authorize another pro- 
gram, and you have not said we are going to eliminate this pro- 
gram and put this one in. You are authorizing another program to 
do the same thing that is already happening without deauthorizing 
another program. 

And what I get told: Well, we do not do things that way. Well, 
the American people do things that way. Business does things that 
way. States do things that way. Why should Congress not do it? So, 
really, we are shooting the messenger here. The messenger — there 
is a vacuum in terms of oversight, and we now have an Adminis- 
tration that has attempted, whether we think their tool is good or 
not. And you do not doubt that the tool is getting better as they 
have used it? They are using a tool that is improving, that does 
have maybe some bias and does have some risk for manipulation 
in it, but the fact is it is the only thing available right now, espe- 
cially since this Subcommittee has a hard time getting even agencies to 
come and testify before it or to give us information. 

Mr. Hughes. I will respond with two points. One, your shooting 
the messenger analogy, I think that may be part of our criticism 
of it, but our problem with it is that when the messenger leaves 
with his message, and when he gets to his destination, he is car- 
rying two different messages. There is a problem with the trans- 
mission along the way, and that is something that is important to 
realize, regardless of where the criticisms are being pointed at. 

I think the second thing is, and I sympathize with your frustra- 
tions about oversight in Congress, and that is certainly something 
that we would like to see a ton more of. I think you can kind of 
get around some of the rhetoric around what government — we have 
all these programs, and they do not do anything that is important. 
If we had more oversight, if we had more openness about what the 
government actually does, I think people will actually have a great- 
er appreciation of things. 

Senator Coburn. Right. 

Mr. Hughes. So I think our criticism — try to be focused on this 
particular instance of PART, the way that this PART assessment 
works. I do not think that it should be thrown in the garbage can. 
I think that it is very important that people in Congress and people 
in the agencies and the public know that this should be, despite the 
fact that there is not a lot going on elsewhere, this should be a 
really tiny part about evaluating how government works. That 
would be my caveat about — I am sympathetic to the fact that it is 
not going on elsewhere, but try not to latch on to it and say this 
is the tool, and this is what is going to get us there. 

Senator Coburn. Nobody has in Congress. Would all three of you 
agree that some type of assessment of goals and measurement 
against the goals changes expectations of program managers? 

Mr. Johnson. I agree totally. 

Senator Coburn. Ms. Norcross. 

Ms. Norcross. Totally correct. 

Senator Coburn. Mr. Hughes. 

Mr. Hughes. In my limited experience, I would say that is right. 
Senator Coburn. Ms. Norcross, you have some experience with 
performance tools in New Zealand, and I also know that South 
Korea has adopted assessment programs. Could you comment on 
those two things? 

Ms. Norcross. Morris McTeague, with whom I work at the Gov- 
ernment Accountability Program, has direct experience with the 
New Zealand experience in developing performance information 
systems and applying them to remedy some of New Zealand’s budg- 
et crises. And if I could answer that question later, I could get you 
more information in specific on some of the reforms that they have 
undertaken. We are right now doing an analysis of that. 

Senator Coburn. OK. 

Ms. Norcross. So I could provide that for you. 

Senator Coburn. Mr. Johnson, the question of bias in the instru- 
ment that you use, give us an example of three or four of the ques- 
tions that PART asks about programs. 

Mr. Johnson. Well, it asks if the program has a clear definition 
of — this is not exact wording, but it asks about do you have a clear 
definition of success? Does it have a good way of measuring your 
performance relative to that? Is it meeting its performance goals? 
It asks about the quality of management the program has. Does the 
program have an efficiency goal? It is Management 101, or it is Ac- 
countability 101. 

These assessments are put together by the agency and OMB, not 
by OMB alone. The agency and OMB are supposed to agree on the 
program performance goals. Just as agencies are afraid to disagree 
with Congress, agencies are sometimes afraid to disagree with 
OMB about its assessment. But if they really disagree with the as- 
sessment, agencies can submit their disagreements to an appeals 
board that I chair and that is made up of deputy secretaries from 
four or five agencies. We get a number of appeals every year, and 
we review them. Some of them, we approve, and some of them, we 
reject. 

And we also conduct what we call consistency checks, where we 
review if the PART follows the rules we have for answering the 
questions. We also review whether the answers in a PART are con- 
sistent with each other. 

As we look also at programs dealing with the same subject across 
agencies, we pull all the relevant program assessments together 
when we start doing cross-cut analyses to make sure we are 
equally attentive to the issues, equally focused on the quality of the 
performance measures and so forth, because we are going to be 
using this information to compare one program to another. We use 
cross-cutting analyses to see if there is something an ineffective 
program can learn from an effective program dealing with the 
same topic. 

So there is a lot of effort to make the assessments consistent; to 
make the information reliable; and to remove bias. There is bias in 
anything a human being does. So I have no doubt that these are 
not perfect instruments, but they will get better over time, and the 
assessments that we have done in the last 2 years are better than 
the assessments we did in the first 2 years. 

Senator Coburn. So, in other words, the programs set their own 
outcome measurements. 

Mr. Johnson. The program does. OMB and the program staff
have to agree that the performance measures are acceptable. 

Senator Coburn. And then, they measure themselves against it. 

Mr. Johnson. They then determine the metrics they will use to 
measure performance and how to collect the data, and how often 
to collect that data. 

Senator Coburn. And so, where is the bias in that? 

Mr. Johnson. I do not know; just because 

Senator Coburn. If they are participating in setting the goal, 
and they are participating in setting the metrics, and they are the 
ones doing the measurement of the metrics, where is the bias? 

Mr. Johnson. I do not know where there is unusual bias. I know 
that there is a bias in anything that human beings are involved in. 
So I do not know what specifically Mr. Hughes is talking about. 

Senator Coburn. In teaching to the test, a problem across agen- 
cies as they respond to PART questions, Mr. Hughes wrote in his 
testimony that agency officials told him they gamed the system to 
avoid negative scores and consequences. Do you think that is true? 
Is there something in the program to help alleviate? We know ev- 
erybody games when they are being measured to an extent. Are 
there things in the PART assessment system that take that into 
account? 

Mr. Johnson. I know that agencies like to be green. They really 
like to be green, and they really like to have good PART scores. 
And so, they do a lot of things to please OMB and to get good
scores and to look good on that scorecard. 

Senator Coburn. Does that carry out into changed programs and 
changed management to make the programs more effective to de- 
liver better process and therefore better response by the govern- 
ment to the very people they are supposed to be helping? 

Mr. Johnson. Well, teaching to the test, gaming the PART to get 
a good score and providing only superficial analysis does not help 
the program work better. One approach that we have to improve 
the quality of program performance is shine a real big light on it 
all, which is one of the primary reasons we took all this assessment 
information, summarized it, and put it on the website
ExpectMore.gov for all the world to see and for people to look up and say
that is not the way I know the program works. An employee can
look at it, or someone served by the program can look at it and say,
well, that program does not work very well; it is ineffective as far
as I am concerned, and they can complain to the agency or
complain to OMB or complain to their Senator or Congressman.

Shining a lot of light on how the program is assessed, on what 
performance measures are used, and on what the performance in- 
formation says can drive improvements in the measures that are 
used, the data that are used, and the quality assessment. So that 
is why I believe it is so important to have No. 1 on your list posted 
on this sign, which is transparency. You can have all of this infor- 
mation, but unless we shine a really big light on it, it will not get 
better over time. That is why we took it with all of its warts, with 
all of its dimples, put it out there. Now, let us begin the process 
with agencies and with Congress, I would hope, to improve this 
program performance information, to make these assessments better
and to make our plans to help programs perform better more
aggressive.

Senator Coburn. Let me follow up on that for a minute. So Mr. 
Hughes can go to every PART assessment and via the government 
Website can look at the goals, the metrics, the measurement of the 
metrics, the response, and the rating. 

Mr. Johnson. Right. 

Senator Coburn. In other words, nothing is hidden. Everything 
that comes to develop that, that can be accessed by OMB Watch,
so they can see all that. 

Mr. Johnson. Right. 

Senator Coburn. Right; OK. 

Mr. Johnson. There is a one-page summary of every PART on
ExpectMore.gov. There are links at the bottom to the detailed
PART, which is multiple pages. It is written in OMB-speak and has
historical information and more detailed information. That is the
meat of the assessment. The summary and all the details are
available on ExpectMore.gov.

Senator Coburn. But they can get access 

Mr. Johnson. Yes, sir. 

Senator Coburn [continuing]. If they need to as well. 

And so, given your emphasis on transparency, are we to assume 
that you are going to be very accepting of our OMB transparency
bill that Senator Carper and I have. 

Mr. Johnson. That is on the contracting information. 

Senator Coburn. Online grants, contracting, everything. 

Mr. Johnson. We love transparency, and we are working very ef- 
fectively with your staff to figure out 

Senator Coburn. I understand that. 

Mr. Johnson [continuing]. The best way to get there as soon as 
possible. We are big on transparency and shining the light of day 
on performance to strengthen accountability.

Senator Coburn. Any other comments from any of our panelists? 

Mr. Hughes. If I could just respond to some of the stuff that we 
have been talking about, the bias in the data and how the data, 
which data is important and which data is not important, there is 
a tension between outcomes and outputs in any type of perform- 
ance management initiative. The PART focuses on outcomes, which 
is certainly a good goal. We think it is more of a broad government- 
wide goal, maybe something that should be included in something 
like the GPRA, the Government Performance and Results Act. 

A lot of times, you cannot judge the effectiveness of programs 
based purely on outcomes, and I will give you a couple of examples. 
One program that is run out of, I believe it is the National Park 
Service, is an office that works as a consultant with local commu- 
nities to transform the neglected or unused areas into public space: 
Parks, playgrounds, those sorts of things. They have collected, long 
before PART came along. And another thing that you should know 
is that agencies have collected performance data before PART came 
along. It is not like they were not doing this to begin with, and 
then, all of a sudden, OMB says you have to do it.

Senator Coburn. Some were not. 

Mr. Hughes. Some were not; that is correct; not all of them.

They had a couple of standards by which they judged whether 
they are doing a good job in this program that acts as a consultant. 
One is through surveys with the local communities that they con- 
sult with: Were you satisfied? Did people use the parks? Did you 
like the services we provided? Another way is they used to collect 
data about based on the amount of money that they were given, 
how many square acres of parks did they create? How many miles 
of jogging trails, those sorts of things. 

Those are outputs, the second part. The survey part could be 
both. OMB, in the PART process, wanted them to focus on
outcomes. And one of the things that they said should be an outcome
was: Are the people living in the community healthier? And I think
that is a perfectly good goal. I think people should be healthier. 
But the program in the National Park Service has no way to force 
people to go and jog in the park. All they can do is say this is the 
money we got to create parks for communities. These are the parks 
we created. These are the people we worked with and what they 
thought of what we did. 

That is one instance, one example of multiple examples of the 
difference between outcomes and outputs and how certain pro- 
grams are not necessarily structured to focus completely on out- 
comes, or maybe the outcomes are beyond their control. And one 
other example I will share about bias within the PART is the Ap- 
palachian Regional Commission. This is a program that Congress 
decided to be a patchwork, to cover the holes between other pro- 
grams that were working in similar issue areas. 

Senator Coburn. I have been trying to get rid of it for 10 years. 

Mr. Hughes. I am aware of that. [Laughter.] 

And I do not have a personal perspective on the Regional Com- 
mission myself, but Congress designed this program to fill in the 
blanks, in the holes between programs, and the PART assessment 
said that this is not a unique program, which of course, it is not, 
because Congress designed it not to be unique. It was designed to 
be duplicative, because the evidence that Congress had seen at the 
time said that there are things that are being missed. 

And we can talk more about, maybe we should have just pulled 
all of the programs together and redesigned them so that the holes 
are not missed, and that is certainly something that OMB is trying
to do. 

Senator Coburn. The whole point is you raise the question about 
what Congress has not done so that they will do it better. And to 
say that it is a blunt — it is a blunt tool, but it raises it up to a level 
so that somebody has to now — let’s address this, and we have not 
addressed the Appalachian Regional Commission. What we have 
done is we have let it continue to do exactly what it does, and the 
danger with that is: One, we are not efficient; two, we could design 
a program that helps a whole lot more people with the same dol- 
lars; or three is we could help the same amount of people with a 
whole lot less dollars, which gives us dollars to help somebody else 
somewhere. 

So the point is that is a commission that I am very well informed 
on, and I believe even the blunt tool will show that we could be 
much wiser as Congress to make the goals of that program more 
effective. I believe outcomes is the measure. I believe the American 
people want outcomes as the measure. But part of it is laziness on
our part. When we write a program, and this is something I am 
critical of Congress, we ought to be very specific about what our in- 
tentions are, and we are not. We ought to be very specific about 
what we expect, and we are not. We ought to be very specific on 
how we want to measure whether we got what we expected, and 
we are not. 

So a lot of the problems do not have anything to do with you all 
in front of us; they have to do with Congress not being good legisla- 
tors so that we design a system that can be looked at later and say 
did we go after what we intended to go after? Did we accomplish 
what we intended? And did we do it in a way or within the cost 
parameters that we thought it would? 

And so, the real criticism is not at OMB. They are dealing with
what we have dealt them. The real criticism is for us in not being 
specific enough in terms of — and you can ask staff: When I write 
a bill, I want it all the way down to the T. I want limited discre- 
tion, because if we are going to write a bill and do not know 
enough about it, we should not be writing the bill until we get the 
information to write a bill correctly. And I do not know many peo- 
ple who would disagree with that. It is just easier to write it loose 
and let somebody else worry with the details, and that is called 
lazy legislating. 

Mr. Hughes. That has been our experience working with you on 
the transparency bill as well. I think, though, that it is not nec- 
essarily as easy as you might make it out to seem. There is another 
example: The Consumer Product Safety Commission was ruled 
down on the PART review for not using cost-benefit analysis in its 
regulatory rulemaking. It is actually prohibited by Supreme Court 
decision from using cost-benefit changes like that in their rule- 
making. 

That is not something that the program can control. And I agree 
with you that it is good that even with the Appalachian Regional 
Commission example that these things are brought up to Congress. 
But as you have said many times, with the lack of the kind of in- 
vestigatory role that Congress is playing now into how programs 
are made, we are concerned that the information that we all admit 
has some biases and those sorts of things will be taken as a snap- 
shot, and the investigation will not be done to get underneath what 
the rating is. 

So the ineffective for the Consumer Product Safety Commission, 
it should not be said we should get rid of it. But it is the bias in 
the tool that gives you the ineffective. 

Senator Coburn. But experience tells us that is not happening, 
because Congress is not paying attention to PART, and they are 
not paying attention to their own. They are ignoring them both. So 
your fear is unfounded, because we are not using it. 

Mr. Hughes. Well, we would like to be vigilant. 

Senator Coburn. We should do both, though. We should be using 
theirs plus our own, and that is the point. Outcomes, to me, is the 
measurement, not outputs. And outcomes, if we design something 
to have an outcome, then, we ought to know what that outcome 
measurement is, and then, we ought to hold agencies accountable 
to be to that outcome. And I will just give you a great example:

How about the incidence of HIV reduction in this country, which 
has not happened, and then, we spend money on flirting classes? 
And there is no connection between the two. In other words, if 
somebody is going to measure outcome, we ought to be asking why, 
with all of the money we are spending on HIV that we are not see- 
ing a reduction in the incidence of new HIV cases in this country. 

And yet, nobody is measuring the performance against that out- 
come, and that is an outcome that makes a difference in lives. It 
is not outputs; yes, we are spending a lot of money, but we are not 
measuring outcomes, and therefore, we are not getting the ability 
to make the programmatic changes that need to be made on the 
congressional side to accomplish that. 

If the court prohibits a program from operating well, OK, if it 
prohibits a program from operating well, that tells us we have a 
problem in the design of the program. 

Mr. Hughes. Well, I would disagree with that classification. I do 
not think it is prohibiting it from operating well. I think it is say- 
ing that the program needs to take certain considerations into ac- 
count when it does operate. It needs to say there are certain things, 
equity issues within programs in the Federal Government that are 
important to take a look at. It is not necessarily that the Supreme 
Court is putting up a roadblock in front of them getting the job 
done. The Supreme Court is making a value judgment about how 
the program should operate. 

Senator Coburn. Which is not the Supreme Court’s job. The Su- 
preme Court’s job is to interpret the laws and the Constitution and 
the treaties, not to tell Congress how to run the budget of the coun- 
try, and that is 

Mr. Hughes. And I would also say to you, too, that it is not 
OMB’s job to tell you how to run the budget of the country either. 

Senator Coburn. No, we agree. 

Mr. Hughes. Yes. 

Senator Coburn. You will not disagree with me on that at all. 
I believe we have abdicated our responsibility, and the reason OMB
is having to do this is because we have not. But I have no heart- 
burn with somebody doing it somewhere. At least we have some in- 
formation with which to make a decision. 

I want to thank each of you for being here. I would like a little 
more formal response from your organization on specifics on how 
you would definitely change an assessment tool program for agen- 
cies and what we might be able to accomplish that would limit, and 
I want that as justifiable constructive criticism so that when we 
look at PART, we can have your thoughts in detail on how we can 
assess that and maybe make recommendations. 

Mr. Hughes. We look forward to that opportunity. 

Senator Coburn. All right. 

Thank you all so much for being here. The hearing is adjourned. 

[Whereupon, at 3:43 p.m., the Subcommittee adjourned.] 

The Federal Government Wants to be Held Accountable

Americans deserve to have the government spend their hard earned tax dollars 
effectively, and better every year. The President, every member of Congress and 
all Federal employees need to be held accountable for getting results with the 
money they spend. 

The PART — How We Figure Out What’s Working and What’s Not 

To find out what’s working and what’s not, OMB and agency officials work
together to determine whether a program: 

• Has a clear purpose and a sound design; 

• Sets outcome-oriented and suitably aggressive goals; 

• Is well managed; and 

• Achieves its goals. 

This assessment is done systematically through the Program Assessment Rating 
Tool, or PART. It is a set of common questions that are asked of every program,
though it also includes additional questions for certain types of programs such as 
credit programs or competitive grants programs. The questions aim to identify a 
program’s strengths and weaknesses so agencies can better identify actions needed 
to improve the program. 

A key element of each assessment is defining the program’s performance goals and
determining whether they are being achieved. Performance goals are central to the 
PART. Through the PART process, OMB and agencies ensure all programs have a
clear definition of success and that they have outcome-oriented performance
measures to judge their success. In order to achieve the most accurate program 
assessment, the PART process is collaborative. Agency and OMB staff work 
together and consider all available data in determining the answers to the 
questions. This supporting data is explained and cited in the detailed PART, which 
is available for public scrutiny at ExpectMore.gov. 

The answers to the questions are used to generate an overall score for the program. 
Based on numeric ranges, the overall score is then translated into one of four 
qualitative ratings: Effective, Moderately Effective, Adequate, and Ineffective. If a 
program has not been able to develop outcome-oriented performance measures or 
collect performance information to measure performance against those goals, it 
receives a Results Not Demonstrated rating. 
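
As a rough illustration of the score-to-rating step described above, the following Python sketch maps an overall score to a rating. The 0-100 scale, the numeric cutoffs (85, 70, 50), and the function and parameter names are assumptions made for illustration; this statement says only that the ratings are based on numeric ranges and does not give the ranges themselves.

```python
# Illustrative only: the scale and cutoffs below are assumed,
# not taken from this testimony.
ASSUMED_CUTOFFS = [
    (85, "Effective"),
    (70, "Moderately Effective"),
    (50, "Adequate"),
    (0, "Ineffective"),
]


def part_rating(overall_score: float, has_outcome_measures: bool = True) -> str:
    """Translate an overall score (assumed 0-100) into a qualitative rating."""
    if not 0 <= overall_score <= 100:
        raise ValueError("overall_score must be between 0 and 100")
    # A program lacking outcome-oriented measures or performance data
    # receives this rating instead of one of the four score-based ratings.
    if not has_outcome_measures:
        return "Results Not Demonstrated"
    # Walk the cutoffs from highest to lowest and return the first match.
    for floor, label in ASSUMED_CUTOFFS:
        if overall_score >= floor:
            return label
    return "Ineffective"
```

Under these assumed cutoffs, a call such as part_rating(72) would fall in the Moderately Effective range, while any program flagged as lacking outcome measures would be reported as Results Not Demonstrated.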

Whether a program is rated Effective or Ineffective, we are constantly looking for 
ways to improve its performance. Every program commits to taking steps to 
improve its performance and get more for taxpayer dollars every year. Some are 
more aggressive than others and we are working to strengthen these improvement 
plans. 

ExpectMore.gov = Transparency = Accountability 

Summary and detailed information about all assessed programs is posted to 
ExpectMore.gov, a website launched with the release of the President’s FY07 
Budget. The site is the most comprehensive source for information about programs 
we’ve assessed and their plans to improve. The purpose of this website is to 
provide easily understandable, candid information about which programs work, 
which programs don’t, and what they are all doing to improve. 

Currently, the ratings on ExpectMore.gov show that more than 70 percent of 
Federal programs are performing. A program that enhances highway safety 
provides a clear example of a program that demonstrates improved results. To 
reduce fatalities from automobile accidents, the National Highway Traffic Safety 
Administration promotes greater seat belt use among high-risk groups such as 
younger drivers, rural populations, pick-up truck occupants, 8-15 year-old 
passengers, occasional safety belt users, and motor vehicle occupants in states with 
secondary safety belt use laws. As a result, nationwide seat belt use increased 
from 73 percent in 2001 to 82 percent in 2005, which is an all-time high. This 
saves lives. 

However, almost 30 percent of all programs are either ineffective or cannot
demonstrate their success. A youth employment program created under the 
Workforce Investment Act demonstrates the need for improvement. The program 
awards grants for America's neediest youth to successfully transition to the 
workplace. The program is currently rated as ineffective. It does not have 
authority to target or reallocate resources to areas of greatest need and duplicates 
other programs. To remedy this problem, the Administration is working with 
Congress to gain increased authority to reallocate resources to areas of need. The 
Administration has also proposed legislation to consolidate this program with other 
Department of Labor job training grants. This will reduce overhead, ensure that 
more funds go directly to participants, and give States the flexibility to design 
processes that best serve their citizens. 

We believe the transparency provided by ExpectMore.gov creates more 
constructive dialogue about how to improve program performance, and extra 
incentive to perform. ExpectMore.gov is not targeted to Democrats or 
Republicans, liberals or conservatives. Its audience is all Americans. 

Program Assessments and the Federal Budget 

This past year, the Administration assessed an additional 20 percent of the 
government’s programs, marking the fourth year in our effort to find out what 
works, what doesn’t, and what we need to do to improve. Program assessments are 
a factor in budgeting, but they are one among many factors. No budget decision is 
made automatically based on a program’s rating. It may be that a highly rated 
program is not a priority for this Administration; therefore the President may 
propose to decrease funding for the program. A poorly rated program may need 
additional funds to address a weakness uncovered in the assessment. If we believe 
a program has been demonstrated to be ineffective and can’t be fixed, or has 
outlived its usefulness, the Administration may recommend Congress spend the 
money on higher priority programs. The attached table shows the funding 
recommendation by program rating and by program. 

This year’s budget calls for major reductions in, or total eliminations of 141 
Federal programs, saving nearly $15 billion. There are a variety of reasons for
these reductions; primarily, they were not getting results or not fulfilling essential
priorities. Reductions in these areas do not mean Americans should expect less 
from Federal agencies or programs. On the contrary, they should expect the 
government to give them more for their tax dollars. They should expect the 
government to become more effective and efficient each year. 

One program the Administration proposes to terminate is the Advanced
Technology Program (ATP), a grant program for businesses that was intended to 
develop new technologies for commercial use. A PART analysis for this program 
noted that there are many non-governmental entities investing in early stage
technology development, such as corporate research labs, venture capital firms, 
angel investors, and universities. The program is no longer warranted in today’s 
research and development environment. Federal subsidies to industry for ATP 
projects are not appropriate or necessary, given the growth of venture capital and 
other financing sources for high-tech projects and the profit incentive private
entities have to commercialize new technologies. 

The Administration also proposes to eliminate the Even Start program and redirect 
funds to programs that are likely to be more effective at improving early childhood 
education including Title I. Even Start’s poor results on national evaluations over 
a number of years and Ineffective PART rating provide strong justification for 
terminating the program. The children and adults who participate in the program 
do not make greater literacy gains than non-participants. The most recent 
evaluation concluded that, while Even Start participants made small gains, they did 
not perform better than the comparison group that did not receive Even Start 
services. 

Because the National Assessment of Vocational Education found no evidence that 
high school vocational courses themselves contribute to academic achievement or 
college enrollment, the Administration proposes to terminate this program as well. 
Under the PART, Vocational Education State Grants was rated Ineffective because 
it has produced little or no evidence of improved outcomes for students despite 
decades of Federal investment. While the Administration has urged Congress to 
reform the Vocational Education program, neither the House nor Senate 
reauthorization bills adopted significant reforms to the current program. 

Americans deserve better than to have their tax dollars invested in ineffective 
programs. 

Congress and the Focus on Results 

Like the Executive Branch, Congress wants programs to work. I believe the PART 
can be useful to Congress in its appropriation, authorization and oversight of 
programs. In some cases, Members of Congress are making use of the information 
to improve programs. Even Start is a good example. In 2004, the Administration 
proposed to fund only continuation awards, based on PART findings, and to begin 
phasing out the program. In 2005, the Administration proposed termination. 
Congress provided the first funding cut for the program in 2005 (-$22 million), 
reducing it from $247 million to $225 million. The Congress reduced the program
further in 2006 to $99 million. 

Certainly, we can do a better job of making the information available in a form that 
is more useful to Congress. The report accompanying the Treasury,
Transportation, and Housing and Urban Development Appropriations Act that
recently passed the House Committee on Appropriations stated: “[M]ost [budget]
justifications continue to be filled with references to the Program Assessment
Rating Tool [PART], drowning in pleonasm, and yet still devoid of useful
information.” While a harsh assessment, I agree that we can improve. We must
do a better job of more clearly articulating our objectives, not only for programs, 
but about how we expect information about program performance to be used. We 
also must do a better job of providing information about program performance in a 
way that is useful to you, the Congress. ExpectMore.gov is a first step in that 
effort. I would be grateful for the Committee’s suggestions on how we might do 
more. 

How has the PART changed? 

Like programs, the PART process will improve over time. Although the 
Administration has tried to keep PART questions constant so the performance of 
programs can be compared over time, we have adopted changes in the PART 
process. We have implemented better information technology solutions to make 
application of the PART less burdensome and more collaborative. We review each 
newly completed PART to ensure the answers are consistent with PART guidance. 
If agencies disagree with a PART assessment, they can appeal to a panel of senior 
agency officials. These steps and others will make the PART more reliable, less of 
a burden, and hopefully, more focused on identifying what steps programs need to 
take to become more effective. 

Conclusion 

The message is simply that we want our citizens to expect more from their Federal 
government, and we want to be held accountable for how programs perform and 
how aggressively they improve. Of course, we do. 

Written Testimony Of

Eileen Norcross, M.A. 

Senior Research Fellow for the Government Accountability Project 
The Mercatus Center at George Mason University 

Before the 

Subcommittee on Federal Financial Management, Government Information 
And International Security of the Senate Subcommittee on Homeland 
Security and Governmental Affairs 

June 13, 2006 


Mr. Chairman and Members of the Subcommittee: 


Thank you Chairman Coburn, Senator Carper, and Members of the
Committee on Homeland Security and Governmental Affairs for inviting me 
to testify today on “Autopilot Budgeting: Will Congress Ever Respond to 
Government Performance Data?” 


Our work in the Government Accountability Project at the Mercatus Center 
at George Mason University focuses closely on the use of performance 
information in government. I note that the views expressed in my testimony 
are not an official position of the University. 

Before beginning I would like to submit for the record our paper on the 
results of the FY07 PART for your reference. 

I. The importance of performance information

According to the Office of Management and Budget the federal government 
has created over 1000 programs to address a range of policy issues from 
alleviating poverty to curing dreaded diseases and protecting the nation from 
attack. Some programs have been in existence for decades and spend billions 
of dollars towards achieving their goals. 

In many cases we do not know if they are working. 

A program is a tool to achieve a policy goal. Do economic development 
programs result in prosperous communities? Do job training programs lead 
to increased employment? Are homeland security programs protecting the 
nation from attack? 

Unless Congress knows the answers to these questions, it cannot make 
informed decisions about how to spend resources. More importantly, 
Congress cannot accomplish its policy aims. 

Without information on program performance, agencies cannot meet their 
missions and goals. The public is left in the dark about whether the 
government is solving the problems Congress has identified as important. 

Not knowing has several consequences:

a) Program duplication

Over the years Congress has created hundreds of programs addressing a 
single outcome. There are anywhere between 180 and 342 programs 
dealing with economic development in over 24 agencies. There are 44 
job training programs in nine agencies; 130 programs serving at-risk
youth, and 72 safe water programs, to name a few. Program duplication 
on this scale implies that Congress isn’t sure which programs are 
reaching their goals. It has no way of comparing programs with common 
outcomes. 

b) Fewer Public Benefits 

Not knowing whether a program is performing means possibly not 
reaching those who are supposed to benefit. The real cost of an under- 
performing job training program is not merely the amount of money 
spent on the program; it is the lost opportunity to spend those funds on 
programs that are working. It is the number of people left unemployed by 
ineffectively spent dollars. 

c) A barrier to the agency 

When programs are not required to produce performance information 
they cannot know if their activities meet the program’s ultimate policy 
objective. Not evaluating programs on a regular basis means that the 
program’s statute may be preventing the program from properly targeting 
grantees, or delivering results. Performance information permits dialog 
between Congress and agencies on common grounds and a common 
understanding of joint objectives. 


II. The state of performance information: PART and GPRA 

Congress took the initiative in getting agencies to develop performance
measures in 1993 when it passed the Government Performance and Results 
Act (GPRA) (P.L. 103-62). GPRA requires that agencies produce three types 
of reports: strategic plans, annual performance plans, and annual reports on 
program performance. 

The annual report is supposed to give the American people accurate, timely 
information and let them assess the extent to which agencies are producing 
tangible public benefits. GPRA has encouraged the development of 
performance measures and data. But it was not until the development of the 
Bush Administration’s Program Assessment Rating Tool (PART), that real 
progress towards developing measures was made. Programs are now 
creating outcome measures because PART is holding them accountable for 
showing results. 

Requiring the information is the first step. It helps agencies articulate goals. 
It identifies weaknesses in the statute or management of the program. It 
informs the Executive in making budget recommendations. Unless Congress 
uses performance information, attempts at holding programs accountable for 
results are merely a paper exercise. 

Has PART played a role in the President’s proposed budget or in Congress? 
For two years in a row the president has issued a “Major Savings and 
Reform Report” to accompany the proposed budget. 
For FY 2006, the president recommended that 154 programs be either 
terminated or reduced. Congress accepted 89 of the proposals partially, or in 
full, for a total savings of $6.5 billion. 

Of the 99 programs recommended for termination, Congress agreed to 
terminate 24 and to reduce funding for 28. 

Of the 55 programs proposed for reduction, Congress reduced funding for 37 
of them. 
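
The figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names and category labels are ours, not the report's. It confirms that the two groups of proposals sum to 154 and that the three groups of accepted proposals sum to 89.

```python
# Proposals in the president's FY 2006 Major Savings and Reform report,
# as cited in the text. Names and labels here are illustrative.
proposed = {"termination": 99, "reduction": 55}

# Congressional action on those proposals, as cited in the text.
accepted = {
    "termination proposals: program terminated": 24,
    "termination proposals: funding reduced instead": 28,
    "reduction proposals: funding reduced": 37,
}

total_proposed = sum(proposed.values())
total_accepted = sum(accepted.values())
acceptance_rate = total_accepted / total_proposed

print(f"proposed: {total_proposed}")
print(f"accepted at least in part: {total_accepted}")
print(f"acceptance rate: {acceptance_rate:.1%}")
```

On these figures, Congress acted on somewhat more than half of the proposals, at least in part.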

Of the 154 programs, 54 were PARTed. The president’s document indicates 
where PART played a role in the recommendation. Other factors were also 
taken into account: lack of a federal role, obsolescence, completion of 
mission, duplication with public or private efforts, policy priorities, and 
earmarking. The administration uses PART in conjunction with other 
information and does not limit itself to the evaluations. It does not 
automatically cancel programs with poor ratings; nor does it automatically 
reward satisfactory ones. 

By contrast, Congress issued its own report on which programs it terminated 
in FY 2006. The “On Time and Under Budget” report from the House 
Committee on Appropriations lists 53 programs that were terminated. We 
could only identify three programs that were PARTed: Tech-Prep Education 
State Grants, Occupational and Employment Information, and Community 
Oriented Policing Services (COPS). We do not know if PART played a role 
in Congress’s decisions to terminate these three programs. 

The House Appropriations Committee report offers explanations for only 
three of its terminations: Jobs-in-the-Woods (obsolete or completed 
mission), National Youth Sports (activity performed by private sector), and 
U.S. Capitol Mounted Police (no clear benefit or need). 

We do not know why the remaining 50 were terminated. Were they 
underperformers, or politically easy choices? 

It is useful to compare the two reports. In the Executive’s we are given a 
rationale for each recommendation. The House report merely provides a list. 
Ultimately, the goal is not to randomly kill programs. The process of making 
judgments about how to fund activities should be constructive and based on 
solid evidence, not destructive. If program managers do not know why their 
program had its funding reduced, then no one has learned anything. 

Performance information is not about how to kill programs. It is about how 
to make them effective. We have a stake in knowing what works and what 
doesn’t, and why. It is about delivering public benefits in a transparent 
manner, and ensuring that agencies know to what standards and expectations 
they are performing. 

The only way to give Congress’ budgetary decisions credibility is to base 
them, in part or in full, on a reliable evaluation of program performance. 
Congress should use performance information in conjunction with other 
criteria; e.g., is there a federal role for the activity? Has the program 
completed its mission or become obsolete? Is this activity a national 
priority? Does this program do the best job of addressing a particular 
problem versus similar programs across the government? 

Is PART that system? 

PART’s methodology has been validly criticized. Assigning quantitative 
scores to groups of questions and aggregating the percentages into a single 
qualitative score may not fully reflect the program’s performance. To 
illustrate, the Screener Training program in the Department of Homeland 
Security received a rating of adequate. It received a 100% in both the 
purpose and design category and the planning category, and an 86% in the 
management category but only a 13% in the results and accountability 
category. An adequate rating on its face may indicate to the reader that this 
program is satisfactorily meeting the objective of training airport screeners. 
However, according to the results section, this program has not acquired 
sufficient information to evaluate its performance. 

Improvements can and should be made to the methodology. What is most 
important about PART, however, is that it asks Management 101 questions 
of agency activity. 

For example: 

• “Is the program purpose clear?” 

• “Does the program address a specific and existing problem, interest, 
or need?” 

• “Is the program designed so it is not redundant or duplicative of any 
Federal, state, local, or private effort?” 

• “Is the program design free of major design flaws that would limit the 
program’s effectiveness or efficiency?” 

• “Is the program effectively targeted so program resources reach the 
intended beneficiaries and/or otherwise address the program’s 
purpose directly?” 

• Has it demonstrated adequate progress towards its long-term outcome 
performance goals? 

This kind of logical process of questioning agency activity needs to be 
continued. Congress should be asking these kinds of questions before 
allocating resources. 

The questions are the substance of PART. The ratings are based on PART’s 
methodology of quantifying the answers to these questions. Improvements 
can and should be made to the methodology, but we should not disregard the 
contribution PART is making to getting agencies to critically examine their 
activities through the lens of outcomes. 

PART’s virtues are: 

1) It has identified and cataloged agency activities, giving us a common 
unit of analysis. 

2) It is transparent and accessible to the public. 

3) It is systematically conducted. PART holds all programs accountable 
to the same standards. 

4) By asking Management 101 questions of program performance, 
PART focuses agencies on measuring outcomes, not outputs. 

5) There is one notable strength of PART that is frequently cited as a 
weakness. PART rates programs on statutory limitations. This is a 
source of frustration for agencies that are bound to follow statute, 
even when the statute may be working against the ultimate goals of 
the program. Here PART has provided a service. It is identifying 
those aspects of programs that are barriers to success. Congress 
should take up the work of reviewing program statutes and continually 
ask if they are designed to achieve the intended aims of the program. 

Some limitations and areas for improvement: 

1) PART currently rates programs against their own historical 
performance. It has not advanced to the stage of being able to 
compare like activities, though it is attempting to do so through 
cross-cutting analyses and by asking if the program’s objectives are 
being addressed elsewhere. 

2) As we see with the Screener Training program, in some cases, the 
measures don’t fully capture program performance. 

3) Different budget examiners may reach different conclusions viewing 
the same set of data. 

Congress is not bound to use PART ratings. But by ignoring what PART 
is trying to advance, Congress is missing an opportunity to meet the goals 
of GPRA. The administration can only take the development and usage 
of performance information so far. Congress has a responsibility to both 
agencies and the public to provide clear justification for its budgetary 
decisions. A systematic evaluation of program performance gives credibility 
and reliability to the budget process. 


III. Congress’s Opportunity: Advancing GPRA 

Recent legislative proposals indicate that Congress is aware of the 
importance of establishing a system for evaluating programs. 

Learning from PART, Congress has an opportunity to implement its own 
process to systematically review programs, based upon a logical process of 
questioning programs and holding them accountable for outcomes. 

In addition to requiring and paying attention to agency performance 
evaluations, Congress needs to consider the following: 

1) Where the statute is a barrier to performance, Congress must work to 
update and change the statute so programs are able to meet their 
objectives. 

2) Congress should articulate clear expectations of programs in the 
statute, including specific outcome-based measures of progress. 

3) Outcome Based Scrutiny: Congress should be able to compare like 
programs that serve the same policy goal and ask which are producing 
results, and which aren’t. 

There are a few proposals in Congress to codify the systematic review of 
programs and get Congress to use performance information. 

The Federal Agency Performance Review and Sunset Act introduced by 
Representative Kevin Brady (R-TX) would allow the president to give a list 
of programs to a Congressionally established Sunset Commission to review. 
The Commission then gives its findings and recommendations to the 
president, and the president replies with his comments on those 
findings. 

Sunset Commissions give Congress the task of reviewing agency data, 
eliminating concerns of Executive political influence. 

One area for improvement is that Sunset Commissions should consider 
evaluating government activity according to common outcomes across 
agencies, rather than reviewing the activities of discrete agencies. 

For example, if the policy is to alleviate urban blight, what programs or 
tools across the federal government currently exist? Which ones work best at 
addressing the problem and achieving results? Where might we move 
resources towards investing in programs that are successful in eliminating 
blight? 

A second piece of legislation being considered is offered by Representative 
Todd Platts (R-PA), the Program Assessment and Results Act (H.R. 185). 
This bill would rely on OMB to conduct assessments of agency programs 
at least once every five fiscal years. The legislation would also ensure that 
review criteria take into account programs performing similar functions. The 
results of the assessments are to be submitted to Congress. 

This legislation identifies the need for comparing like activities and would 
advance PART as it currently exists. 

We believe both of these pieces of legislation are steps in the right direction, 
each with positive aspects. 

We do not believe Congress need adopt OMB’s PART wholesale. We hope 
improvements can be made to the methodology. Alternatively, 
Congress might consider using the questions PART asks as the basis for 
developing its own method of reviewing government performance and use 
the data to help inform its decisions. It is not beyond Congress’ reach to 
create and administer such a system, building upon the kinds of questions 
PART is asking. 

Indiscriminate cancellation of programs discredits the process. It leaves 
program managers confused about why their program failed. Programs need 
to deliver according to clear expectations. When they do not meet them, 
reduction in funding or termination should be the result. It should not be a 
surprise. They should be given the chance to prove their effectiveness. And 
we must also recognize that performance information is best used in 
conjunction with other criteria: lack of a federal role, low performance, 
duplication, completion of mission. All of these form the basis against which 
Congress should continually scrutinize agency activity. Efforts to advance 
what PART has set in motion can only aid Congress in the budget process 
and give confidence to the American people that the problems our nation has 
identified are being solved. 


An Analysis of the Office of Management and Budget’s 
Program Assessment Rating Tool (PART) for Fiscal Year 2007 


Executive Summary 

With the release of the Bush Administration’s proposed budget for FY 2007, the Office 
of Management and Budget (OMB) has completed its fourth year of the Program 
Assessment Rating Tool (PART) for evaluating federal programs. Designed as a means 
of encouraging agencies to develop performance measures and data in order to show 
program results, PART is used, in conjunction with other information, to make 
recommendations in the president’s budget as well as to inform Congress about agency 
progress towards goals. 

This paper analyzes results of the PART to date and seeks to determine how agencies 
have fared over time according to PART’s methodology. To this end, we examine, 
among other things, the proportion of agency budgets PARTed as results not 
demonstrated, or lacking in performance measures or data. We also consider how PART 
ratings are related to Congressional funding levels and the executive’s funding 
recommendations. 

According to OMB, the improvement of PART scores over time shows that many 
programs are improving in their ability to meet their goals, offering relevant data, and 
establishing measures to facilitate OMB’s PART evaluation. The share of programs 
rated effective has risen from 6% in FY 2004, the first year of PART, to 16%. Overall, 
the share of programs rated results not demonstrated (that is, not providing 
enough information to be evaluated) has fallen from 50% in FY 2004 to 24% in FY 2007. 

Those rated ineffective remain relatively steady at 4%. Some agencies have a larger 
proportion of their funding associated with ineffective scores. In particular, 22% of the 
Department of Housing and Urban Development’s (HUD) funding is rated ineffective. 
Much of this is due to the fact that OMB rated two of HUD’s largest programs, the 
Community Development Block Grant program ($4.1 billion) and Project Based Rental 
Assistance ($4.95 billion), as ineffective. 

To date, OMB has PARTed 64 percent of the budget, or $1.47 trillion. Six percent of 
the FY 2005 funding level for PARTed programs, representing $143 billion, falls into the 
results not demonstrated rating category. 

Last year, the president issued a Major Savings and Reform report in which he 
recommended 154 programs for termination or reduction. The administration used 
PART, in some cases, to inform these decisions. Congress accepted 89 of these proposals 
at least partially, reducing spending by $6.5 billion. 

This year, the president has again issued a Major Savings and Reform report, in which he 
is recommending 141 programs for either termination or reduction, representing $15 
billion in spending. Like last year, the administration cited PART assessments as 
informing some of these decisions. 

A new break-down included in this year’s PART assessments isolates programs by 
“topic” or programmatic activity. According to this categorization, 47% of programs with 
an education focus are unable to show results, while 33% of foreign affairs programs are 
rated effective. The purpose of this new category is to facilitate comparison of similar 
activities across agencies. As it did last year, OMB applies PART data along with other 
information to perform crosscutting analyses of research and development programs, 
federal investment programs, credit and insurance, and programs that provide aid to state 
and local governments. 


Background 

In February 2003, the Bush administration released, with its proposed FY 2004 budget, a 
new method for evaluating the performance of federal programs called the Program 
Assessment Rating Tool (PART). PART represents the Bush administration’s effort to 
get agencies to report consistently on their programmatic goals and results in order to 
improve performance and facilitate funding decisions. It is one of the five initiatives of 
the President’s Management Agenda. 

PART is an element of the Administration’s Budget and Performance Integration 
initiative to link performance information to budgeting decisions, also known as 
“performance budgeting”. A performance budget is “an integrated annual performance 
plan and annual budget that shows the relationship between funding levels and expected 
results. It indicates that a goal or set of goals should be achieved at a given level of 
spending.” ^ The effort to get agencies to link budgets and performance information 
originated in 1993 with Congress’ passage of the Government Performance and Results 
Act (GPRA). 


I. PART’s Methodology and Application 

PART requires that agencies submit an assessment of their programmatic performance to 
OMB over a six year period. To date, OMB has rated 793 of roughly 1000 federal 
programs it has identified. By FY 2008, OMB will have assessed all identified programs 
at least once. 

OMB bases PART ratings on program manager responses to a series of between 25 and 
30 Yes/No questions. The questionnaire includes four sections, each weighted 
differently and each dealing with an aspect of program performance: purpose and design (20%), 
strategic planning (10%), program management (20%) and results/accountability (50%). 
The individual assessments for each program are provided on OMB’s interactive website, 
ExpectMore.gov.^ 

The results/accountability section (section four) of PART receives the greatest weight. 
This section’s questions are designed to determine if the program has met or achieved 
efficiencies in its long-term performance goals and how the program compares with 
similar programs. It also asks if the program has been independently evaluated, and if so, 
what those evaluations determined. Section four also includes the program’s relevant 
performance measures and data with suggestions for improvement. 

A program may receive one of five ratings: ineffective, adequate, moderately effective, 
effective, and results not demonstrated. The latter rating means that a program does not 
have enough information (either measures or data) to be rated — not that the program is 


^ John Mercer, Performance Based Budgeting for Federal Agencies, AMS, Fairfax, 2002, p.2 

^ For a more detailed description of the assessment process see OMB’s website, 
http://www.whitehouse.gov/omb/part/index.html. 
ineffective. It is important to note that a program could receive an acceptable rating even 
if the results information suggests the program is ineffective. This is because only 50% of 
the final rating depends on results information. 

Though PART is regarded as a valuable management tool, some believe that its rating of 
programs based on statutory language is unfair and does not take into consideration that 
programs are bound to operate according to the statute as designed by Congress. 
Representative Todd Platts (R-PA) has introduced legislation, the Program Assessment 
and Results Act (H.R. 185), to require that a future program rating tool incorporate 
congressional intent,^ something PART does not do. Currently, PART does not take into 
consideration that a program’s authorizing statute may create barriers in achieving the 
program’s intended outcomes. OMB argues this is intentional and is a means of 
encouraging agencies to consult with Congress on statutory language that may be 
impeding the agency’s or the program’s mission. 

Other criticisms include the claim that PART is not consistently administered and that its 
results are too subjective. Assigning a numerical score is potentially inaccurate. Different 
budget examiners may rate a program differently when presented with the same set of 
information. 

OMB has applied PART data (in conjunction with other information) to undertake 
crosscutting analyses of aspects of federal programmatic activity. These ongoing analyses 
compare programs across agencies on the basis of similar outcomes, or approaches to 
policy problems, with the intent of highlighting best practices, eliminating duplication, or 
improving coordination across agencies. These analyses include crosscuts of research and 
development programs, federal investment programs, credit and insurance, and aid to 
state and local governments. Last year, OMB applied PART data along with other 
information to analyze the performance of community and economic development 
programs across agencies. This produced the policy recommendation called the 
Strengthening America’s Communities Initiative, and the suggestion that 18 similar 
programs be consolidated under one umbrella in the Commerce Department. The 
initiative was rejected by Congress. 

Though PART scores and their application to budget decisions and policy remain the 
subject of debate in Congress and agencies, PART appears to have increased 
Congressional interest in evaluating programmatic activity for results, improving reliable 
performance information, and advancing the goals of GPRA. 

Recent legislative efforts to codify the concept of an annual measurement of program 
performance (not the PART itself) include the Government Reorganization and Program 
Performance Improvement Act of 2005, sponsored by Representative Kevin Brady 
(R-TX).^ The Act, which may come up for a vote in the House during June 2006, would create 
sunset commissions to periodically review and phase out government programs that are 
obsolete, dysfunctional, duplicative, or unable to meet their goals. 


^ Jenny Mandel, “OMB program assessments viewed as flawed budget tool,” GovExec.com, April 4, 2006. 
^ A Senate version has also been introduced. 

On May 25, 2006, Representative John Tanner (D-TN) introduced legislation, House 
Resolution 841, to hold Congress accountable for how it spends tax dollars. Provisions 
include requiring Congress to hold at least two hearings a year on performance reviews 
produced through PART. 

Related to increased interest in the performance of federal dollars, the Federal Funding 
Accountability and Transparency Act (S. 2590), introduced by Senators Tom Coburn 
(R-OK) and Barack Obama (D-IL) in April 2006, would establish a public database to track 
the usage of federal grants. 


II. Study Purpose and Previous Analysis 

This study is an annual update of an analysis we undertook last year in order to examine 
the progression of PART scores over time, to classify the percentage of the federal 
budget represented by particular program ratings, and to explore the relationship between 
PART scores and appropriations. This study does not consider whether PART is affecting 
agency or legislative behavior and funding decisions. Rather, it describes correlations and 
trends in PART scores. 

For the purpose of this analysis, we take PART ratings at face value. But that does not 
mean we necessarily agree with the methodology used or the conclusions arrived at in the 
individual assessments. 

Many of the questions PART asks of agencies are valuable by themselves in that they 
focus program managers on their core missions and accomplishments, and areas that need 
improvement. However, assigning quantitative scores to groups of questions and then 
aggregating the percentages into a single qualitative score may not fully reflect the 
program’s performance. For example, a program may receive a perfect score in three 
categories (purpose and design, strategic planning, and management) but fail in results 
and accountability, and still manage to receive a satisfactory rating. To illustrate, the 
Screener Training program in the Department of Homeland Security received a rating of 
adequate. It received 100% in both the purpose category and the planning category, 
an 86% in the management category but only a 13% in the results and accountability 
category. An adequate rating on its face may indicate to the casual reader that this 
program is adequately meeting the objective of training airport screeners. However, 
according to the results section, this program, which is relatively new, has not acquired 
sufficient information in order to gauge its effectiveness. The PART assessment points to 
a GAO evaluation that shows the program has improved. 
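
The Screener Training arithmetic can be worked through directly. The sketch below combines the four section scores using the section weights described earlier in this paper (20%, 10%, 20%, 50%). The score bands that translate a weighted total into a rating are an assumption on our part and are not stated in this paper. The function and variable names are illustrative.

```python
# Section weights as described in this paper.
WEIGHTS = {
    "purpose": 0.20,
    "planning": 0.10,
    "management": 0.20,
    "results": 0.50,
}

# ASSUMPTION: rating bands (minimum weighted score, label). These cutoffs
# are not given in this paper and are used here only for illustration.
BANDS = [
    (85, "effective"),
    (70, "moderately effective"),
    (50, "adequate"),
    (0, "ineffective"),
]


def weighted_score(section_scores):
    """Combine section scores (0-100) into one weighted total."""
    return sum(WEIGHTS[name] * section_scores[name] for name in WEIGHTS)


def rating(score):
    """Map a weighted total to a rating label using the assumed bands."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    raise ValueError("score must be non-negative")


# Section scores for the Screener Training program, as cited in the text.
screener = {"purpose": 100, "planning": 100, "management": 86, "results": 13}

score = weighted_score(screener)
print(f"weighted score: {score:.1f}, rating: {rating(score)}")
```

Under these assumptions the three non-results sections contribute 47.2 of the program's 53.7 points, which is why a 13% results score is still compatible with an adequate rating. The results not demonstrated rating is assigned separately and is not modeled here.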

Criticisms of PART should not preclude us from studying it more closely. PART 
provides the first attempt to identify, measure, and aggregate performance data across 
agencies. PART is the start of a potentially valuable data source for decision makers 
seeking to understand the effects of individual programs, agency performance in given 
policy areas, as well as possibly providing a window for the public into budgetary 
decision making. 

Just as last year, the president’s proposed budget for FY 2007 also includes a Major 
Savings and Reforms report. This supplement to the budget uses PART scores, in 
addition to other information, to make termination and funding decisions. We also 
analyze this document to find descriptive evidence of how the administration used PART 
in the FY 2007 proposed budget. This does not imply an endorsement or criticism of how 
PART was applied in making these decisions. We have updated last year’s analysis by 
examining what Congress did in response to the president’s request to terminate or 
reduce funding for 154 programs. Additionally, we include the programs that Congress 
terminated independent of the president’s recommendations.^ 

We also examine the Analytical Perspectives of the FY2007 budget^ in order to see how 
OMB is applying PART data in making its recommendations to agencies and 
policymakers. 

1. How PART has rated programs cumulatively. 


Table 1. Cumulative program results by ratings category 

Cumulative Program Results, FY 2004-FY 2007^ 

                            FY 2004   FY 2005   FY 2006   FY 2007 
Effective                      6%       11%       15%       16% 
Moderately Effective          24%       26%       26%       29% 
Adequate                      15%       20%       26%       28% 
Ineffective                    5%        5%        4%        4% 
Results not Demonstrated      50%       38%       29%       24% 
Total                         234       395       607       793 


^ United States House of Representatives Committee on Appropriations, “On Time and Under Budget.” 


In this paper we refer to the fiscal year of the budget in which the PART assessments appeared. That is, 
programs evaluated in 2005 appear in the president’s FY 2007 budget proposal. This avoids confusion 
when trying to locate the PART assessments for a given year. 

With each passing year of PART, there has been a steady decrease in the share of 
programs OMB has rated results not demonstrated. One in seven programs has improved 
its PART scores. 

The cumulative share of programs rated effective, moderately effective, and adequate 
has increased, while the share of programs rated ineffective remains the same as last 
year at 4%. OMB rated 16% of programs as effective and 28% as adequate. The latter 
rating represents a two-point increase. The most significant changes occurred in the share of 
moderately effective programs, which increased from 26% to 29%, and in the share of results not 
demonstrated programs, which dropped from 29% last year to 24%. The 
improvement in cumulative program results may be due to a few factors: a) programs are 
improving their results information, b) evaluations by OMB are getting more, or less, 
accurate, c) OMB happens to be evaluating better-performing programs or, d) agencies 
are developing better performance measures. 
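
Because Table 1 reports shares of a growing cumulative total, it can help to convert the shares back into approximate program counts. The sketch below does this for two rating categories. The counts are approximations, since the published percentages are rounded, and the names used are illustrative.

```python
# Cumulative totals and selected percentage shares from Table 1.
totals = {"FY 2004": 234, "FY 2007": 793}
share_pct = {
    "Effective": {"FY 2004": 6, "FY 2007": 16},
    "Results not Demonstrated": {"FY 2004": 50, "FY 2007": 24},
}


def approx_count(rating, year):
    """Approximate program count implied by a rounded percentage share."""
    return round(share_pct[rating][year] * totals[year] / 100)


for rating in share_pct:
    for year in totals:
        print(f"{rating}, {year}: about {approx_count(rating, year)} programs")
```

On these figures the results not demonstrated share falls by half even though the approximate number of programs carrying that rating rises, because the pool of assessed programs more than tripled.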


Chart 1. Cumulative program results by ratings category 

[Bar chart: Cumulative Program Results by Ratings Category, FY04-FY07. Categories: Effective, Moderately Effective, Adequate, Ineffective, Results not Demonstrated. Series: FY 2004, FY 2005, FY 2006, FY 2007.] 


^ See, Analytical Perspectives of the U.S. Budget, FY2007, p. 15. 

2. Are there observable changes in program performance between FY 
2004 and FY 2007 for reassessed programs? 

OMB has reassessed 151 of the 793 programs it has assessed to date. Of these, 
132 have been rated twice, 18 have been rated three times, and one program, Missile 
Defense, has been rated four times. 


Table 2. Ratings for reassessed programs 

                        Initial PART Rating   Most Recent PART Rating 
RND                             100                      8 
Ineffective                       2                      5 
Adequate                         17                     59 
Moderately Effective             29                     49 
Effective                         3                     30 


As last year, the greatest improvement among programs that have been evaluated more 
than once occurred in programs initially rated results not demonstrated. Of the 100 
programs initially receiving this rating, only eight retained their results not demonstrated 
rating upon their most recent reassessment. The number of reassessed programs rated effective 
increased significantly from three to 30. Of these 30 programs, 15 were initially rated 
results not demonstrated. Another significant change occurred for programs rated 
adequate. Initially 17 programs received this rating; upon reassessment, 59 were rated 
adequate. Improvements were also evident in the moderately effective category as its 
ranks increased from 29 to 49 programs. 

Of the 151 programs reassessed to date, two were initially rated ineffective; OMB has 
since upgraded one of these to adequate. For all reassessed programs, five are currently 
rated ineffective; four of these moved out of the results not demonstrated category. 

3. How did programs move within ratings categories? 

The chart below shows how programs moved from their initial rating to their most recent. 
That is, of the 100 programs initially rated results not demonstrated, what is their current 
rating? Forty-three programs have moved from results not demonstrated to adequate; 15 
have moved to effective; four are now rated ineffective; eight remain results not 
demonstrated; and 30 are now rated moderately effective. Only one program has 
remained ineffective (the Department of Energy’s Oil Exploration and Production 
program), while four programs have moved from results not demonstrated to ineffective. 
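
The movement of the 100 programs first rated results not demonstrated can be tallied as a quick consistency check. The sketch below uses only the counts given in the paragraph above. The dictionary layout and variable names are illustrative.

```python
# Most recent rating of the 100 reassessed programs that were first
# rated "results not demonstrated" (RND), as listed in the text.
moved_from_rnd = {
    "adequate": 43,
    "moderately effective": 30,
    "effective": 15,
    "results not demonstrated": 8,
    "ineffective": 4,
}

total = sum(moved_from_rnd.values())

# Programs that now have enough information to receive a rating.
left_rnd = total - moved_from_rnd["results not demonstrated"]

# Programs now rated adequate or better.
adequate_or_better = sum(
    moved_from_rnd[r] for r in ("adequate", "moderately effective", "effective")
)

print(f"total: {total}, left RND: {left_rnd}, adequate or better: {adequate_or_better}")
```

The five destinations account for all 100 programs, and most of those that left the results not demonstrated category landed at adequate or better.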

Chart 3. How reassessed programs moved within ratings categories from first to 
most recent assessment 

[Chart: Change in Rating From First to Most Recent Assessment] 



4. Programs rated by program type/category 

PART classifies programs according to seven categories: 

1) Block/Formula Grants - Programs that provide funds to state, local, and tribal 
governments and other entities by formula or block grant. 

2) Capital Acquisition - Programs that achieve their goals through development and 
acquisition of capital assets (such as land, structures, equipment, and intellectual 
property) or the purchase of services (such as maintenance, and information 
technology). 

3) Competitive Grants - Programs that provide funds to state, local and tribal 
governments, organizations, individuals and other entities through a competitive 
process. 

4) Credit - Programs that provide support through loans, loan guarantees, and direct 
credit. 

5) Direct Federal - Programs where services are provided primarily by employees of 
the federal government. 

6) Regulatory Based - Programs that accomplish their mission through rulemaking 
that implements, interprets, or prescribes law or policy, or describes procedure or 
practice requirements. 

7) Research and Development - Programs that focus on knowledge creation or its 
application to the creation of systems, methods, materials, or technologies. 

Mixed programs are those that combine elements from two or more categories (e.g., a 
research and development program that uses grants as a means of funding research). 

Examining PART data for FY 2004 through FY 2007 reveals that certain categories of 
programs fare better than others in the ratings. 


Table 4. Most recent PART ratings by program category 

                         RND       Ineffective  Adequate  Mod. Effective  Effective
Block Grant (135)        49 (36%)  11 (8%)      38 (28%)  29 (21%)        8 (6%)
Capital Assets (73)      16 (22%)  2 (3%)       20 (27%)  22 (30%)        13 (18%)
Competitive Grant (146)  52 (36%)  7 (6%)       46 (32%)  30 (21%)        11 (8%)
Credit Program (30)      5 (17%)   1 (3%)       15 (50%)  6 (20%)         3 (10%)
Direct Federal (250)     48 (19%)  4 (2%)       70 (28%)  79 (32%)        49 (20%)
Mixed (2)                1 (50%)   0 (0%)       0 (0%)    1 (50%)         0 (0%)
Regulatory (57)          13 (23%)  0 (0%)       16 (28%)  17 (30%)        11 (19%)
R&D (100)                7 (7%)    3 (3%)       14 (14%)  47 (47%)        29 (29%)


Excluding mixed programs, which account for only two programs of the 793 PARTed, 
both block grant and competitive grant programs continue to have the largest percentage 
of programs rated results not demonstrated, 36% each. And as was the case last year, 
both of these program types continue to have the largest percentage of programs rated 
ineffective, 8% and 5% respectively. 

Direct federal and research and development programs by contrast have the greatest 
percentage of programs rated effective, 20% and 29% respectively. Regulatory programs 
at 19% and capital asset programs at 18% are not far behind. 
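
The shares quoted in this discussion follow directly from the counts in Table 4. As a minimal sketch, assuming the counts are transcribed correctly from that table (the data layout and function names here are illustrative and are not part of OMB's PART data or tooling), the percentages can be recomputed and the categories ranked by the share of programs rated effective:

```python
# Counts transcribed from Table 4, in column order:
# (RND, Ineffective, Adequate, Mod. Effective, Effective).
# The layout and helper names are illustrative assumptions, not an OMB format.
RATINGS = ("RND", "Ineffective", "Adequate", "Mod. Effective", "Effective")

TABLE_4 = {
    "Block Grant": (49, 11, 38, 29, 8),
    "Capital Assets": (16, 2, 20, 22, 13),
    "Competitive Grant": (52, 7, 46, 30, 11),
    "Credit Program": (5, 1, 15, 6, 3),
    "Direct Federal": (48, 4, 70, 79, 49),
    "Mixed": (1, 0, 0, 1, 0),
    "Regulatory": (13, 0, 16, 17, 11),
    "R&D": (7, 3, 14, 47, 29),
}


def rating_shares(counts):
    """Return each rating's share of a category's programs, in whole percent."""
    total = sum(counts)
    return {name: round(100 * n / total) for name, n in zip(RATINGS, counts)}


def rank_by(rating, table, exclude=("Mixed",)):
    """Rank categories by the share of their programs holding `rating`, highest first."""
    shares = {
        category: rating_shares(counts)[rating]
        for category, counts in table.items()
        if category not in exclude
    }
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for category, share in rank_by("Effective", TABLE_4):
        print(f"{category:<18} {share}% effective")
```

Run against the Table 4 counts, the ranking puts R&D first (29%) and Direct Federal second (20%), in line with the figures cited in the text. The two-program Mixed category is excluded by default, as in the discussion.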

[Chart: Cumulative Ratings by Program Category, showing the share of programs in each rating (0% to 100%) for each program category] 


Crosscutting analysis for credit programs and block grant programs 

Credit programs 

The ratings for program categories raise the question of why certain kinds of programs 
seem to operate more effectively than others. Included among OMB’s crosscutting 
analyses are credit programs. OMB’s analysis includes a detailed look at how credit 
programs perform within each of the four ratings areas (program purpose and design, 
strategic planning, management, and results). The analysis indicates that credit 
programs receive high scores for program purpose and design, 77% on average, although 
this is slightly lower than the average for all programs, 86%. Credit programs score low 
in program results (53%), yet compared to the average score for all programs, 47%, this 
is relatively high. 

In terms of program purpose and design, OMB finds that though many of these programs 
have clear purposes, they are often duplicative of other programs or private sources, and 
have poor incentive structures, limiting their effectiveness. “For example, private lenders 
are generally better at screening borrowers, but they may not screen borrowers effectively 
if the Government provides a 100% loan guarantee.” [10] Thus, OMB suggests that these 
programs work more closely with private lending institutions. 

In the area of strategic planning, OMB states that credit programs have good short-term 
measures, but are lacking in longer term metrics, such as linking their budgets to 
outcomes, and performing stringent performance evaluations. 

OMB notes that in terms of program management, credit programs are strong in terms of 
basic finance and accounting practices, yet should incorporate more measures of risk 
analysis. 

In the most heavily weighted category, program results, OMB states that credit 
programs are weak, despite their higher than average score. One reason is the 
difficulty of measuring the net outcome of the program, that is, what would have 
happened in the absence of the program. In addition, credit programs must 
accurately estimate cost. OMB notes that the complexities and dynamic nature of 
financial markets make credit programs difficult to measure. As private entities reach 
more underserved populations, government credit programs may have decreased results. 
Conversely, if financial markets are in turmoil, government credit programs may become 
more effective. “A sub-par review could be related to financial market developments; the 
program might have failed to adapt to rapid changes in financial markets; or its function 
might have become obsolete due to financial evolution.” [11] 


Programs that provide grants to states and localities are also the subject of a crosscutting 
analysis in this year’s budget. These 211 programs are a subset of block grant and 
competitive grant programs, representing $209.8 billion in spending in 2005. Of these 
211 programs, 41% are rated results not demonstrated, higher than the average for all 
programs (31%). OMB states that this is because grant programs have a broad purpose, 
and a general “lack of agreement among grantees and federal parties on the purpose and 
performance measures, and therefore lack of focused planning to achieve common 
goals.” [12] 

This marks the second year that OMB has scrutinized block grant programs. OMB 
notes block grants are one of the most common tools used by the federal government, 
providing social service funding to states and localities. They are generally regarded as 
‘flexible’ in that local grantees may determine how best to use the funds. However, OMB 
states that “accountability for results can be difficult when funds are allocated based on 
formulas and population rather than achievement or needs.” Additionally, block grants 
pose performance management challenges, reflected in the high number of ineffective 
programs among block grants, 8%. 


[10] See Analytical Perspectives of the U.S. Budget, FY 2007, p. 68. 
[11] Op. cit., pp. 68-69. 
[12] Analytical Perspectives of the U.S. Budget, FY 2007, p. 105. 

OMB notes that it intends to continue monitoring block grant programs to highlight best 
practices, sharing successful methods with low-performing programs. 

5. PART Ratings by program topic 

This year OMB budget examiners assigned a “topic” to each PARTed program during 
evaluation. The topic reflects the majority of the program’s activities and is drawn 
from a sub-category of the federal budget codes. This designation may be useful since 
it allows cross-agency analysis of programs based on common outcomes. 


Table 5. Programs rated by topic 

                                         RND       Ineffective  Adequate  Mod. Effective  Effective
Agriculture (72)                         20 (28%)  1 (1%)       21 (29%)  26 (36%)        4 (6%)
Business and Commerce (80)               18 (23%)  3 (4%)       25 (31%)  21 (26%)        13 (16%)
Community & Regional Development (51)    15 (29%)  4 (3%)       18 (35%)  10 (20%)        4 (8%)
Disaster Relief (19)                     4 (21%)   1 (5%)       3 (16%)   5 (26%)         6 (32%)
Education (105)                          49 (47%)  7 (7%)       25 (24%)  10 (10%)        14 (13%)
Energy (69)                              8 (12%)   2 (3%)       10 (14%)  30 (43%)        19 (28%)
Foreign Affairs (83)                     9 (11%)   0 (0%)       23 (28%)  24 (29%)        27 (33%)
Government Administration (65)           14 (22%)  1 (2%)       20 (31%)  15 (23%)        15 (23%)
Health and Well-being (137)              36 (26%)  5 (4%)       45 (33%)  37 (27%)        14 (10%)
Housing (34)                             10 (29%)  4 (12%)      7 (21%)   12 (35%)        1 (3%)
Law Enforcement (62)                     15 (24%)  1 (2%)       21 (34%)  16 (26%)        9 (15%)
National Security (93)                   12 (13%)  0 (0%)       15 (16%)  31 (33%)        35 (38%)
Natural Resources and Environment (150)  34 (23%)  4 (3%)       55 (37%)  45 (30%)        12 (8%)
Science and Space (46)                   5 (11%)   0 (0%)       7 (15%)   15 (33%)        19 (41%)
Training and Employment (36)             5 (14%)   5 (14%)      15 (42%)  10 (28%)        1 (3%)
Transportation (49)                      13 (27%)  1 (2%)       5 (10%)   24 (49%)        6 (12%)
Veterans Benefits (9)                    2 (22%)   0 (0%)       2 (22%)   5 (56%)         0 (0%)

Assessing PART ratings according to topic shows that certain programmatic areas, across 
agencies, are getting better ratings than others. Nearly half, or 50, of education programs are 
rated results not demonstrated, while more than a quarter, or 27 of 83, of foreign affairs 
programs are rated effective. More than one-third, or 35, of national security programs are 
rated effective. And 28%, or 10 of 36, of training and employment programs are rated either 
results not demonstrated or ineffective. 

The relatively poor performance of education programs may be related to the fact that 
many of these are grant programs, which as OMB has noted tend to underperform 
relative to other types of programs. 


[Chart: Ratings by Topic, showing the share of programs rated RND, Ineffective, Adequate, Mod. Effective, and Effective (0% to 100%) for each of the 17 topics listed in Table 5] 

6. Programs rated by agency [13] 

Some agencies have a higher percentage of programs that are rated results not 
demonstrated or ineffective than others. The agency with the greatest number and percent 
of programs rated results not demonstrated is the Department of Education at 55% or 41 
programs of 74 rated to date. Last year it was second to the General Services 
Administration (GSA), but this year GSA has seen a drop in the number of programs 
rated results not demonstrated from eight to five, or from 61% to 37%. 

Other agencies with relatively large proportions of their programs rated results not 
demonstrated include: Department of Homeland Security with 38%, Department of the 
Interior (37%), Housing and Urban Development (32%), Department of Agriculture 
(27%), and Health and Human Services (27%). 

Housing and Urban Development has a high percentage of programs rated ineffective at 
16%. Department of Labor follows with 14% or four of its programs rated ineffective. 
The Environmental Protection Agency also has four programs rated ineffective, or 9%. 

The highest rated agencies include the National Science Foundation with 100% of its 
programs rated effective. The Nuclear Regulatory Commission also has a high percentage 
of its programs rated effective at 80%. Other highly rated agencies include: Department 
of State (50%), Department of the Treasury (38%), NASA (22%) and Department of 
Transportation (20%). 


[13] OMB includes a category for smaller agencies called “Other.” We have extracted the five CFO agencies 
from this categorization for this analysis: Social Security Administration, General Services Administration, 
Nuclear Regulatory Commission, Office of Personnel Management and USAID. The remaining agencies in 
the other category include the following: Consumer Product Safety Commission, Corporation for National 
and Community Service, Office of National Drug Control Policy, Export-Import Bank of the U.S., 
Tennessee Valley Authority, Federal Communications Commission, Federal Election Commission, Public 
Defender of the District of Columbia, Securities and Exchange Commission, Armed Forces Retirement 
Home, Broadcasting Board of Governors, Trade and Development Agency, American Battle Monuments 
Commission, International Assistance Programs, National Archives and Records Administration, 
Commodity Futures Trading Commission, Delta Regional Authority, National Credit Union 
Administration, Court Services and Offender Supervision Agency for the District, Neighborhood 
Reinvestment Corporation, Appalachian Regional Commission, Denali Commission, and Smithsonian 
Institution. 

Table 6. PART ratings according to agency 

Agency            Results Not   Ineffective  Adequate  Moderately  Effective
                  Demonstrated                         Effective
Agriculture (70)  19 (27%)      0 (0%)       19 (27%)  28 (40%)    4 (6%)
Commerce (28)     5 (18%)       0 (0%)       8 (29%)   10 (36%)    5 (18%)
Defense (32)      4 (13%)       0 (0%)       7 (22%)   10 (31%)    11 (34%)
Education (74)    41 (55%)      6 (8%)       21 (28%)  4 (5%)      2 (3%)
Energy (50)       4 (8%)        2 (4%)       7 (14%)   26 (52%)    11 (22%)
HHS (90)          24 (27%)      4 (4%)       28 (31%)  24 (27%)    10 (11%)
DHS (45)          17 (38%)      0 (0%)       10 (22%)  11 (24%)    7 (16%)
HUD (25)          8 (32%)       4 (16%)      5 (20%)   7 (28%)     1 (4%)
DOJ (27)          5 (19%)       1 (4%)       12 (44%)  6 (22%)     3 (11%)
DOL (28)          3 (11%)       4 (14%)      12 (43%)  8 (29%)     1 (4%)
State (40)        3 (8%)        0 (0%)       9 (23%)   8 (20%)     20 (50%)
Interior (63)     23 (37%)      0 (0%)       15 (24%)  20 (32%)    5 (8%)
Treasury (29)     6 (21%)       1 (3%)       6 (21%)   5 (17%)     11 (38%)
DOT (25)          0 (0%)        1 (4%)       2 (8%)    17 (68%)    5 (20%)
VA (9)            3 (33%)       0 (0%)       2 (22%)   4 (44%)     0 (0%)
EPA (43)          3 (7%)        4 (9%)       28 (65%)  8 (19%)     0 (0%)
NASA (9)          0 (0%)        0 (0%)       3 (33%)   4 (44%)     2 (22%)
NSF (10)          0 (0%)        0 (0%)       0 (0%)    0 (0%)      10 (100%)
SBA (8)           0 (0%)        0 (0%)       4 (50%)   3 (38%)     1 (13%)
SSA (2)           0 (0%)        0 (0%)       0 (0%)    2 (100%)    0 (0%)
GSA (13)          5 (38%)       0 (0%)       2 (15%)   4 (31%)     2 (15%)
NRC (5)           0 (0%)        0 (0%)       0 (0%)    1 (20%)     4 (80%)
USAID (11)        0 (0%)        0 (0%)       5 (45%)   5 (45%)     1 (9%)
OPM (6)           0 (0%)        0 (0%)       4 (67%)   1 (17%)     1 (17%)
USACE (10)        3 (30%)       0 (0%)       2 (20%)   5 (50%)     0 (0%)
OTHER (41)        15 (37%)      1 (2%)       8 (20%)   10 (24%)    7 (17%)

[Chart: Ratings by Agency] 



Examining PART ratings by both agency and topic indicates that education programs 
tend to have a large number of programs that are either ineffective, or lacking in results. 
By contrast, foreign affairs and national security programs have a large number or 
percent of their programs rated effective or moderately effective. 

Once more, the Analytical Perspectives section of the budget indicates that some of this 
may be due to the fact that many of the largest education and HUD programs, in terms of 
funding, are grant programs. OMB’s analysis of grant programs shows that this type of 
program tends to lack meaningful outcome data and has difficulty demonstrating 
results. 


7. Agency program ratings as a percent of agency FY 2005 appropriations 

What do these program ratings represent in terms of their proportion to the agency’s total 
annual appropriation? Table 7 shows the ratio of the total of all FY 2005 appropriations 
of PARTed programs (grouped by rating) within an agency to the agency’s total 
appropriations received, according to their FY 2005 financial statements. 

Examining an agency’s performance by analyzing the number of programs receiving a 
particular rating does not necessarily tell us about the effectiveness of budgetary 
resources. To get a clearer picture of agency performance according to PART, we look at 
the percentage of agency budgets receiving a particular rating. For example, as mentioned 
earlier, 55% or 41 of the Department of Education’s programs are rated results not 
demonstrated. This represents 12% of the department’s funding. 
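
The funding-share calculation described here can be sketched in a few lines. This is only an illustration of the ratio (PART funding levels grouped by rating, divided by an agency's total appropriations), not the authors' actual procedure: the function name, the data layout, and every dollar figure in the example are hypothetical placeholders.

```python
from collections import defaultdict


def funding_share_by_rating(programs, agency_appropriations):
    """Percent of an agency's appropriations falling under each PART rating.

    programs: iterable of (rating, funding_level) pairs, in $ millions.
    agency_appropriations: the agency's total appropriations, in $ millions.
    """
    if agency_appropriations <= 0:
        raise ValueError("agency appropriations must be positive")
    totals = defaultdict(float)
    for rating, funding_level in programs:
        totals[rating] += funding_level  # sum funding within each rating
    return {
        rating: round(100 * amount / agency_appropriations)
        for rating, amount in totals.items()
    }


# Hypothetical agency: three PARTed programs, $1,000 million in appropriations.
example_programs = [
    ("Results Not Demonstrated", 120.0),
    ("Adequate", 300.0),
    ("Adequate", 250.0),
]
shares = funding_share_by_rating(example_programs, 1000.0)
print(shares)
```

Because the numerator (PART funding levels) and the denominator (agency appropriations) come from different sources, nothing in this ratio caps a share at 100%. That is consistent with the study's own caveat that some fractions may exceed 100%.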

The Department of Homeland Security (DHS) and the Department of the Interior (DOI) 
both have relatively high percentages of their program appropriations rated results not 
demonstrated, 25% and 32% respectively. Veterans Affairs (VA) has 57% of its 
appropriations rated results not demonstrated. By contrast, the National Science 
Foundation (NSF) has 89% of its appropriations rated effective, corresponding to 100% 
of the ten programs PARTed in that agency to date. The Nuclear Regulatory Commission 
(NRC) also has a high percentage of its appropriations rated effective at 46%. Other high 
performers in terms of budget include the Department of Defense (DOD) with 29% of 
appropriations rated effective and NASA with 22%. 

HUD stands out from all agencies as having the highest percentage of its program 
appropriations rated ineffective at 22%. This is not surprising considering that two of the 
four programs receiving this rating comprise a large portion of HUD’s budget. [14] 

Fifty percent of HHS’s budget is rated moderately effective due to the presence of the 
Medicare program in this ratings category. 


[14] These four programs include the Community Development Block Grant (CDBG) program, funded at $5 
billion; HOPE VI ($143 million); Project Based Rental Assistance ($4.95 billion); and Rural Housing and 
Economic Development ($24 million). 

Table 7. Percentage of agency funding levels according to ratings category 

Agency      Results Not   Ineffective  Adequate  Moderately  Effective  Total assessed as a    Total agency FY05
            Demonstrated                         Effective              percent of FY05        appropriations
                                                                        agency appropriations  received ($ mil)
Agriculture 17%           0%           22%       61%         2%         103%                   89998
Commerce    5%            0%           45%       51%         11%        111%                   6897
Defense     3%            0%           9%        11%         29%        53%                    298656
Education   12%           3%           58%       5%          0%         78%                    56678
Energy      1%            0%           34%       32%         19%        86%                    21249
HHS         2%            0%           2%        50%         4%         59%                    438004
DHS         25%           0%           14%       29%         9%         78%                    34786
HUD         17%           22%          1%        39%         3%         82%                    35448
DOJ         7%            0%           33%       24%         4%         68%                    16016
DOL         0%            5%           9%        19%         1%         34%                    16378
State       5%            0%           24%       15%         46%        90%                    12993
Interior    32%           0%           18%       12%         2%         64%                    9261
Treasury    7%            0%           14%       16%         8%         44%                    15318
DOT         0%            2%           15%       75%         10%        103%                   58618
VA          57%           0%           45%       2%          0%         104%                   76380
EPA         1%            4%           52%       7%          0%         63%                    6844
NASA        0%            0%           36%       37%         22%        95%                    14903
NSF         0%            0%           0%        0%          89%        89%                    4854
SBA         0%            0%           4%        4%          14%        23%                    688
SSA         0%            0%           0%        22%         0%         22%                    127272
NRC         0%            0%           0%        40%         48%        88%                    569
USAID       0%            0%           38%       25%         0%         63%                    4295
OPM         0%            0%           147%      0%          0%         146%                   87998
USACE                                                                                          3982
OTHER                                                                                          17807
Total                                                                                          1471939

8. What percentage of the budget is represented by PART ratings? 

The total amount of money allotted to all of the 793 programs PARTed to date is $1.47 
trillion. This represents 64% of total outlays in FY 2005 (excluding interest on the 
debt). [15] Breaking this out by ratings category, 6% of FY 2005 outlays are rated results not 
demonstrated, which amounts to $143 billion in FY 2005 appropriations. This may seem 
like a relatively small amount. However, some agencies have higher concentrations of 
results not demonstrated programs, which consume a large part of those agencies’ 
budgets, as discussed in the previous section. 

As noted earlier, 22% of HUD’s appropriations for FY 2005 are rated ineffective, or $9.5 
billion of its $41 billion budget. Though ineffective programs account for only 1% of the 
overall federal budget, this represents $18.6 billion of all federal spending in FY 2005. 


[Chart: Percentage of FY05 Outlays by PART Rating] 



[15] Note that the budget amounts given in the PART for individual programs do not represent budget 
authority or outlays but instead represent ‘funding levels’. This may include other kinds of spending such 
as fees and offsetting collections; therefore these figures are rough approximations. We take as our 
numerator the program budget figure or “funding level” reported in PART and calculate it as a percentage 
of the agency’s total budget authority as reported in the agency’s annual financial statement. Due to this 
mismatch, some fractions may exceed 100%. 

9. Mandatory vs. discretionary 

When we consider the budget in terms of mandatory, discretionary, and mixed spending, 
we are able to calculate the percentage of the budget that OMB has PARTed. Using the 
data for the most recent available year, FY 2006, we find that 27% of mandatory 
spending is rated results not demonstrated, while 23% of discretionary spending falls into 
this category. Forty-three percent of mixed spending (programs that have both a 
mandatory and discretionary component) [16] is rated results not demonstrated. Four 
percent of discretionary spending is rated ineffective, while 1% of mandatory spending is 
rated ineffective. The biggest mandatory program rated to date is Medicare, which is rated 
moderately effective and has a funding level of $407.2 billion in FY 2006. 


Chart 9. PART ratings by mandatory and discretionary funding 

[Chart: ratings (RND, Ineffective, Adequate, Mod. Effective, Effective) shown for Discretionary, Mandatory, and Mixed spending] 


[16] This should not be confused with the designation of “mixed” under program category, which defines the 
mechanism (e.g., a loan or a grant) by which programs allocate money. 

10. Presidential funding trends 

How has the president used PART in making FY 2007 budget decisions? By considering 
the difference between the president’s funding request for FY 2007 and what Congress 
appropriated in FY 2006 to the 793 programs PARTed to date, we see that there is a 
tendency for the president to recommend funding decreases for programs with ineffective 
ratings (75%), while recommending increases for a large percentage of effective 
programs (61%). The same percentage (42%) of programs rated results not demonstrated 
and of programs rated adequate were recommended for funding decreases. A relatively 
large percentage of moderately effective programs (56%) were recommended for funding increases. 
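
Each percentage in this section is a column share: the number of programs with a given rating that saw an increase, no change, or a decrease, divided by all programs with that rating. A minimal sketch, assuming the counts are transcribed correctly from Table 10 (the variable and function names are illustrative, not taken from the study):

```python
# Counts transcribed from Table 10 (president's FY07 request vs. FY06 actual).
# Column order: RND, Ineffective, Adequate, Mod. Effective, Effective.
RATINGS = ("RND", "Ineffective", "Adequate", "Mod. Effective", "Effective")

TABLE_10 = {
    "Increase": (50, 3, 80, 129, 76),
    "No Change": (60, 4, 47, 30, 13),
    "Decrease": (81, 21, 92, 72, 35),
}


def shares_within_rating(table):
    """Express each funding outcome as a whole percent of that rating's programs."""
    # zip(*rows) turns the rows into columns, so each sum is a rating's total.
    column_totals = [sum(column) for column in zip(*table.values())]
    return {
        outcome: {
            rating: round(100 * count / total)
            for rating, count, total in zip(RATINGS, counts, column_totals)
        }
        for outcome, counts in table.items()
    }


shares = shares_within_rating(TABLE_10)
print(shares["Decrease"]["Ineffective"])  # share of ineffective programs proposed for cuts
print(shares["Increase"]["Effective"])    # share of effective programs proposed for increases
```

Dividing by the column total rather than by all 793 programs is what makes ratings comparable: only 28 programs are rated ineffective, so 21 proposed decreases is a large share of that group even though it is a small share of all programs.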

Chart 10. Difference between president’s FY07 request and FY06 actual 

[Chart: Difference Between President's FY07 Funding Request and FY06 Appropriation, by rating (RND, Ineffective, Adequate, Mod. Effective, Effective)] 

Table 10. Difference between president’s FY07 request and FY06 actual 

            RND       Ineffective  Adequate  Mod. Effective  Effective
Increase    50 (26%)  3 (11%)      80 (37%)  129 (56%)       76 (61%)
No Change   60 (31%)  4 (14%)      47 (21%)  30 (13%)        13 (10%)
Decrease    81 (42%)  21 (75%)     92 (42%)  72 (31%)        35 (28%)


11. How did Congress appropriate money to PARTed programs (FY 05-FY 06)? 

Programs rated results not demonstrated and ineffective received increases from 
Congress less often (34% and 18%, respectively) than those rated adequate, moderately 
effective, and effective; 59% of effective programs received increases in funding. 
Conversely, 42% of results not demonstrated programs and 79% of ineffective programs 
were given funding decreases. In the case of ineffective programs, the percentage of 
programs given funding decreases by Congress is slightly higher than the percentage 
recommended for decreases by the president. We are not able to say if PART scores were used in 
making these decisions. Table 11 and Chart 11 illustrate the change in congressional 
appropriations between FY 05 and FY 06 for PARTed programs. 


Chart 11. Difference between Congress FY06 and FY05 actual appropriation 

[Chart: Difference Between FY06 and FY05 Enacted Appropriation, by rating] 

Table 11. Difference between Congress FY06 and FY05 actual appropriation 

            Results not   Ineffective  Adequate   Mod. Effective  Effective
            Demonstrated
Increase    64 (34%)      5 (18%)      104 (47%)  122 (53%)       73 (59%)
No Change   47 (25%)      1 (4%)       29 (13%)   28 (12%)        6 (5%)
Decrease    80 (42%)      22 (79%)     86 (39%)   81 (35%)        45 (36%)


12. The president’s Major Savings and Reforms report for FY 2007 

The FY 2007 budget marks the second year that the Bush Administration has issued its 
Major Savings and Reforms report. [17] This supplemental document to the president’s 
recommended budget contains all of the programs that the administration recommends 
for termination, reduction, or reform. This year the president is recommending the 
termination or reduction in funding for 141 programs, representing a potential $15 billion 
in savings. Of these programs, 91 are suggested for termination ($7.3 billion), and 50 
programs are recommended for reduction ($7.4 billion). Sixteen programs are 
recommended for reform. 

13. Ratings for PARTed programs selected for termination in FY07 

Of the 91 programs recommended for termination in the FY07 budget, OMB has 
PARTed 32. OMB rated 15 of the programs as results not demonstrated, seven as 
ineffective, eight as adequate, and two as moderately effective. 

In addition to poor PART scores, reasons for terminating programs include a lack of an 
appropriate federal role, the program completing its mission, overlap with existing 
programs, earmarking, and a change in budget priorities based on policy decisions. 

Appendix 1 located at the end of this paper includes a chart of all 141 programs and the 
reason given by the administration for its recommendation. 

Table 13. PART ratings and current funding levels for suggested terminations in the 
FY 2007 Budget 

($ Mil)                                 RND      Ineffective  Adequate  Mod. Effective  Effective
Terminations                            15       7            8         2               0
Dollar amount proposed for termination  -$2348   -$1843       -$419     -$62            $0


[17] See http://www.whitehouse.gov/omb/budget/fy2007/pdf/savings.pdf 

14. Ratings for PARTed programs suggested for reductions in the FY07 Budget 

Of the 50 programs the administration recommended for reduced funding, OMB has 
PARTed 14. Three are rated results not demonstrated and three more are rated 
ineffective. Six programs are rated adequate, and two are rated moderately effective. 


Table 14. Ratings for PARTed programs recommended for reduction in FY07 

                                      Results Not   Ineffective  Adequate  Moderately  Effective
                                      Demonstrated                         Effective
Reductions                            3             3            6         2           0
Dollar amount proposed for reduction  -$620         -$819        -$1246    -$101



In addition to programs recommended for termination and reduction, President Bush has 
proposed 16 major reforms amounting to $5.7 billion in reduced spending. These reforms 
include re-proposing the Strengthening America’s Communities Initiative. First 
introduced in the FY 2006 budget, the proposal would consolidate 17 existing community 
and economic development programs under one program in the Department of 
Commerce. 

15. What did Congress do in response to last year’s Major Savings and 
Reforms report? 

In FY 2006, the president recommended that 154 programs be terminated or allotted less 
funding. Congress accepted 89 of the president’s recommendations, in full or in part, for 
a total reduction in spending of $6.5 billion. 

Of the 99 programs recommended for termination last year, Congress terminated 24 of 
them and reduced funding for 28, yielding a total savings of $2.7 billion. 

Of the 55 programs proposed for reduction, Congress reduced funding for 37 programs, 
leading to a savings of $3.78 billion. 
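
The counts and dollar figures in this section reconcile with the totals given at its start, as the short tally below shows. It uses only numbers quoted in the text; the variable names are illustrative, and the acceptance share it prints is derived here rather than stated in the source.

```python
# Figures quoted in the text for the FY 2006 Major Savings and Reforms proposals.
proposed_terminations = 99
proposed_reductions = 55

terminated = 24        # proposed terminations that Congress terminated
reduced_instead = 28   # proposed terminations that Congress reduced instead
reduced = 37           # proposed reductions that Congress enacted

savings_termination_list = 2.7   # $ billions, from the termination list
savings_reduction_list = 3.78    # $ billions, from the reduction list

total_proposed = proposed_terminations + proposed_reductions
accepted = terminated + reduced_instead + reduced
total_savings = savings_termination_list + savings_reduction_list

print(f"{accepted} of {total_proposed} proposals accepted in full or in part")
print(f"share accepted (derived): {accepted / total_proposed:.0%}")
print(f"total savings: ${total_savings:.2f} billion")
```

The tally gives 89 accepted proposals out of 154 and $6.48 billion in savings, which rounds to the $6.5 billion cited.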

16. Did PART play a role? 

Of these 154 programs recommended for termination or reduction for FY 2006, OMB 
PARTed 54. Congress agreed to terminate or reduce funding for 21 of the 54 PARTed 
programs. Whether the PART evaluation played a role in Congress’s decision on these 
programs is not certain. Congress does not detail whether PART evaluations were 
considered in its decisions to terminate or reduce funding for these programs. Appendix 
2 provides a full listing of the programs and their associated Congressional action. 

It should be noted that Congress terminated or reduced funding for additional programs 
not included in the president’s recommendations. According to the U.S. House of 
Representatives Committee on Appropriations, Congress eliminated a total of 53 
programs for a savings of $3.5 billion. Some of these (24) are in response to the 
president’s recommendations, while Congress eliminated the remainder at its own 
prerogative. These programs are also included in Appendix 2. 

This is an increase over previous years. In FY 2005, the president proposed terminating 
65 programs but Congress only adopted seven of these recommendations, reducing 
spending by $366 million. 


III. Conclusion 

The purpose of this study was to apply PART data in order to answer some basic 
questions about agency and budgetary performance. Overall, programs have moved from 
not having performance measures and data, to developing information to enable periodic 
evaluation of their performance. The number of programs rated results not demonstrated 
has decreased from 50% in FY 2004 to 24% in FY 2005. Though an improvement, this 
still represents 6% of federal outlays, meaning we do not have sufficient information to 
judge the performance of $143 billion of the federal budget. One percent of total outlays 
is rated ineffective, representing $18.6 billion in spending in FY 2005. 

As last year, Department of Education programs continue to have the largest share 
rated results not demonstrated (55%), representing 12% of its funding in FY 2005. The 
Department of Housing and Urban Development also has a large share of its programs 
rated ineffective, at 16%, representing 22% of its funding in FY 2005. This is because 
two of its largest programs, the Community Development Block Grant program 
and Project-Based Rental Assistance, are rated ineffective; they received $4.1 billion 
and $4.95 billion in funding in FY 2005, a large portion of HUD’s annual funding level. 

According to the president’s Major Savings and Reforms report, PART continues to 
inform some, but not all, Executive decisions in the proposed budget. Of the 141 
programs proposed for either termination or reduction in FY 2007, 46 have been 
PARTed. 

Comparing what the president proposed for funding in FY 2007 
with what Congress appropriated to each program in FY 2006, we find that 75% of 
programs rated ineffective are recommended for funding decreases, while 61% of 
programs rated effective are recommended for funding increases. The correlation is not 
perfect, however. Eleven percent of ineffective programs are recommended for 
increases, and 28% of effective programs are recommended for decreases. 

This mirrors congressional action. When we consider the difference between what 
Congress appropriated to programs in particular ratings categories in FY 2005 with what 
it appropriated to programs in those ratings categories in FY 2006 we find that 79% of 
programs rated ineffective were given funding decreases, while 59% of effective 
programs were given funding increases. Conversely, 18% of ineffective programs were 
given funding increases, while 36% of effective programs were given funding decreases. 
In the case of ineffective programs, Congress gave funding decreases to more programs
than recommended by the president. We are not able to say if PART played a role in
Congress’s decisions to terminate or reduce funding for programs. 

The Committee on Appropriations notes that, “the only way to establish accountability in 
the budget process is to stop spending on programs that have outlived their usefulness or 
could be delivered more effectively at the state or local level.” 

PART, it should be noted, is the Executive’s attempt to advance performance budgeting. 
Trying to link budgets with performance information is an idea that originated in 1994 
under GPRA. Though PART has advanced a particular method for evaluating 
government activity, using PART to make congressional decisions is not the goal; rather,
it is to encourage agencies to gather and report on program activity by establishing and 
using reliable outcome measures. This also means open and frequent dialog between 
program managers and Congress on the policy aims and intent of programs Congress has 
established to achieve its goals. Imparting increased transparency, and consistency, to the 
budget process means Congress and the Executive must systematically evaluate program 
activity and show taxpayers how public benefits are being achieved by either funding or
de-funding activities that Congress has deemed a federal responsibility. 

If Congress is to truly implement GPRA, i.e., to link budget and performance information
in order to strategically allocate resources, it must first require reliable, consistent, 
performance information from agencies, and then it must use it, in conjunction with other 
information. This also means moving the appropriations debate from one of dollars spent 
to one of public benefits sought and achieved. 

PART’s methodology should continue to be subject to criticism and scrutiny, but this
should not detract from PART’s main contribution, which is to forward performance
budgeting within agencies, while bringing increased transparency and accountability to 
the budget process inside the Executive Branch. 

Testimony to the Senate Homeland Security and Governmental Affairs
Subcommittee on Federal Financial Management, Government Information, and
International Security

June 13, 2006 

Adam Hughes, MA 
Director of Federal Fiscal Policy 
OMB Watch 


Chairman Coburn, Ranking Member Carper, members of the subcommittee: My name is Adam
Hughes and I am the Director of Federal Fiscal Policy at OMB Watch — an independent,
nonpartisan watchdog organization. Thank you for inviting me to testify today on what we all can
agree is a crucial cause — making our government the most effective and responsive it can
absolutely be. 

OMB Watch was founded in the early 1980s and has spent over twenty years advocating for 
government accountability, transparency and access to government information, and citizen 
participation in governmental processes. OMB Watch believes citizens must take an active role in 
holding their government accountable and that the federal government, when supported by sensible
fiscal policy, can develop the programs and safeguards that meet the public’s needs. 

This issue has taken on added importance during the Bush administration as a combination of 
factors, some avoidable, some not, have plunged the federal government into debt. Large and 
sustained deficits over the past five years have made efficient use of government resources all the 
more important. In light of the anticipated budget crunch due to the baby boomers’ retirement over
the coming decades, the fiscal situation of this country will only deteriorate further. Performance
measurement can therefore become a particularly attractive alternative for those who want to set
federal priorities based on the current fiscal prospects of a strained and shrinking revenue base (that
is, without expanding that base to fund longstanding programmatic commitments).

OMB Watch has been commenting on government performance issues for the better part of its 
existence. We have spent more time analyzing the Government Performance and Results Act 
(GPRA) and the PART over the last ten years as government itself has implemented multiple 
initiatives and mechanisms to attempt to gauge whether goals are being met. 

We are supportive of the concept of improving federal capacity to meet the public’s needs. OMB 
Watch has worked for over 20 years to protect and improve that capacity, and we have been open to
the possibility of using performance measurement as a means for achieving those ends. We bring a
strong belief in the importance and potential of government itself to the work we do, and because of 
that belief, we want government to be responsive to community needs, spend money effectively, 
and accomplish its goals. We are advocates for government and therefore have a very strong
self-interest in seeing government programs get results.

PART, however, is a very poor mechanism for measuring program performance and results,
introducing biases and a skewed ideological perspective into a model claiming to present consistent
and objective performance data and evaluations of government programs. Oftentimes, the PART
actually decreases the efficiency and effectiveness of government through increased administrative 
burdens, distracted managers, and compliance costs. 

Ironically, the PART mechanism itself does not produce the right type of results to further support 
and improve government. We believe PART ratings should not be directly connected with the 
budgeting process of Congress because of significant deficiencies — mainly the substantial biases 
and limitations embedded within the tool and the additional limitations we have observed in OMB’s 
actual application of PART. 

Based on our studies of PART and our longstanding commitment to an open, accountable 
government that is responsive to the public’s needs, I come to you today with three points to make: 

(1) PART continues a troubling trend we have seen in other executive branch initiatives 
and even congressional proposals — namely, a trend to arrogate increasing power to 
the White House, even in areas that by constitutional design have been committed to 
Congress. 

(2) PART is so limited and distorted a tool that it should be used neither for management 
nor for budget and appropriations decisions. Both by the design of the tool and as the 
mechanism is implemented, PART systematically ignores the reality of federal 
programs and judges them based on standards that are deeply incompatible with the 
purposes that federal programs are expected to serve. As one agency contact 
memorably explained to us, PART assessments are tantamount to a baseball coach 
walking to the mound to remove his pitcher and then chastising him for not kicking 
enough field goals as he brings in a reliever. 

(3) There is a better way. Specifically, Congress already has the means to investigate 
and produce far more sophisticated analyses of the usefulness, effectiveness, and 
results of government programs. In fact, this is one of the primary roles, if not the primary
role, of the legislative branch. While the oversight function of Congress may not be
as robust as it once was due to significantly shorter legislative sessions and delays 
due to a sharply divided political climate, the capacity to judge the results of 
government programs already exists within the existing structures of Congress — 
structures that do not carry with them the significant limitations and negative 
consequences of the PART. 

I. PART: EXAMPLE OF BROADER SHIFTS IN POWER IN GOVERNMENT 

Before I discuss some of the specific weaknesses and negative consequences of the PART, I want 
to point out a larger trend in government over the last few years that we believe PART is connected 
to. Since the Bush administration came into office and after the terrorist attacks in September 2001, 
we have seen a steady shifting of power to the executive branch in many different facets of our 
government — particularly security and military policy. 

Yet this larger trend toward increased executive power has spilled over into other areas outside
security and defense. Some of the “budget process” changes currently being considered by
Congress also have a tendency to consolidate yet more power in the White House. Specifically, the
president's enhanced rescission proposal scheduled for debate this month in Congress and a proposal
to establish sunset commissions gaining traction in the House are indicative of this larger trend by
allowing the President increased power over spending priorities and program authorizations —
activities that are the proper domain of Congress. These proposals represent a disturbing trend.

In some ways PART is even worse than those proposals for two distinct reasons. First, PART is 
more insidious: whereas the other proposals openly seek to arrogate power to the White House, 
PART portrays itself as an unbiased evaluator of results and performance while serving the White 
House’s political priorities. As I will discuss today, PART is anything but an even-keeled evaluator 
of government programs. 

Second, the White House is using PART to supplant Congress’s role and even to contravene
long-settled Supreme Court precedent. By substituting the president's or OMB's subjective policy
preferences and biases for those of the other branches of government, the PART is a seemingly
innocuous tool for the executive to manipulate the balance of power across all of the federal 
government and remove some of the checks and balances that are an integral part of our 
representative political system. For this reason alone, PART should be approached extremely 
cautiously by those outside the administration. 

II. TWO TYPES OF BIASES LEAD TO FLAWED TOOL 

I would like to focus on two main aspects of the biases inherent in PART. I believe these biases are
significant and numerous enough to discredit the PART from being heavily or directly involved 
with both budget requests and appropriations and also management of programs. 

Both by the design of the tool and as the mechanism is implemented, PART systematically ignores 
the reality of federal programs and judges them based on standards that are deeply incompatible 
with the purposes that federal programs are expected to serve. As one agency contact memorably 
explained to us, PART assessments are tantamount to a baseball coach walking to the mound to 
remove his pitcher and then chastising him for not kicking enough field goals as he brings in a new 
pitcher. 


A. STRUCTURAL BIASES EMBEDDED IN TOOL DESIGN 

1. Overly Simplistic Model Fails to Capture Diversity, Complexity, and Possibilities of the 
Federal Government 

The intricacies of the federal legislative process, the necessity of crafting coalitions to pass 
legislation, and the shifting face of congressional representation often lead Congress to create and 
later amend a wide diversity of federal programs with multiple, and at times conflicting, goals. The 
PART tool — because of its crude design and over-simplified rating system — is not robust enough 
to capture the complexity inherent in the federal government. 


PART Testimony: Adam Hughes, OMB Watch 




First and foremost, the black and white rating scale (ranging from effective to ineffective) ignores
the multiple and diverging reasons a program could be succeeding or failing. Different programs have
different problems for different reasons. Perhaps a program is struggling to achieve its mission 
because it is underfunded and an ineffective program deserves more resources. The PART ratings 
are unable to convey such complexity. 

The one-size-fits-all approach of the PART review process often minimizes or ignores important 
differences in purpose and design between varying types of government programs, possible 
intentionally overlapping goals between programs and departments, and even multiple goals 
Congress has charged a single program with achieving. 

Social problems are complex and diverse, and federal programs must accordingly take many shapes, 
attempt many approaches, and address a wide range of needs. The assumptions embedded in the 
very design of PART — that all that can be meaningfully known about programs is quantifiable; that 
programs have a single, unitary purpose that never adjusts to changing circumstances; that the only 
meaningful work performed by federal programs leads to a single outcome — are short-sighted 
assumptions that embody a narrow and simplistic vision of the role of government. It is simply too
crude to serve as a useful guide for government management. 

Perhaps the most obvious failure of the tool in this regard is its narrow insistence on outcome 
measures as the benchmark of programmatic success. The outcome measurement straightjacket is 
problematic because it is inadequate to the task of informing the management of programs that can 
only be measured in terms of outputs or that are difficult to measure in terms of either outputs or 
outcomes. This blind adherence to outcome measures in the tool design fails to accommodate some 
very important types of programs. For example:

• Multiple programs with varying approaches to the same problem. Block grants,
competitive grants, and demonstration grants are all ways to experiment with solutions 
to complex social problems. Grants to state and local governments, for example, 
attempt to take advantage of the fabled “laboratories of democracy” to experiment with 
ways to attack persistent and often intractable social problems. For some issues, such 
as foster care, Congress has decided that multiple programs in multiple agencies and 
departments — including the Title IV-E entitlement, the Adoption Assistance program, 
the Chafee Independence Living Program grants, Medicaid, special education services, 
and more — are needed to meet the needs of abused and neglected children. PART’s
rigid criteria for uniqueness and unitary performance goals ignore the value of 
multiplicity and overlap and create perverse incentives to recentralize in the federal 
government what Congress has decided to shift to the states. 

• Research programs, such as the National Toxicology Program and the IRIS database, 
are intended to close gaps in our knowledge rather than lead to immediately 
measurable outcomes such as reduced incidence of cancer or decreases in lifetime 
fatality risks from exposure to toxic substances. In these cases, improvements in what 
we know and what we can reasonably determine are valuable in and of themselves, not 
because they lead to other measurable consequences. The PART tool fails to recognize
the value in pure research programs and the like; not only does PART therefore fail to
offer anything of value to the management of such programs, but it also threatens to
lead to reduced funding and distorted priorities for no justifiable reason. 

• Research programs are the canary in the coalmine for another limitation of the tool: its
bias for short-term impacts rather than long-term efforts. Every EPA research program
PARTed as of the FY06 budget was assessed as “Results Not Demonstrated” (RND),
based on rationales that are deeply incompatible with the purposes of those programs.
OMB criticized these programs for failing to link their research activities with the
accomplishment of outcomes, but such criticism is willfully blind to the very nature
and benefits of research: often we can learn as much from failure as from any success.
This bias is built into the design of the tool itself, according to a member of EPA’s 
Science Advisory Board, who testified “it appears that the weighting formula in the 
PART favors programs with near-term benefits at the expense of programs with long- 
term benefits. Since research inevitably involves more long-term benefits and fewer 
short-term benefits, PART ratings serve to bias the decision-making process against 
programs such as STAR ecosystem research, global climate change research, and other 
important subjects.”¹

• Many programs are created to address concerns that are broader and deeper than
PART, with its insistence on quantifiable outcome measures, can begin to
accommodate. The Americorps National Civilian Conservation Corps, for example,
strives to achieve the goals of “strengthening communities” and “increasing civic 
responsibility.” It is not possible to establish quantifiable measures of community 
strength, but that impossibility does not mean that the communities themselves cannot 
attest to their strength. In such cases, the real measure of success will have to be 
subjective and narrative — and must include outside stakeholder input in order to 
balance competing perspectives and viewpoints. PART contains no avenues for
stakeholder input into the program review process. Using PART as a management 
guide will threaten such programs and lead to a government that has no vision and fails 
to embody the public’s most cherished values. 

A management tool that disapproves of visionary, values-driven, future-oriented, or knowledge- 
creating programs is a tool for mismanagement, which would detract from what a federal 
government is uniquely situated to do. 

2. PART Creates Increased Management, Compliance, and Data Burdens 

Over the years since the PART was first introduced, the review process has often forced program 
managers and agencies to alter their existing management and performance review practices, 
institute new and costly data collection structures and systems, generate independent reviews and 
analyses from outside the government and overlay this performance initiative with previous 
government efforts. These alterations to program management have created an entire compliance 
system within itself that distracts energy and resources from achieving program goals. 


¹ Testimony of Dr. Genevieve Matanoski, EPA Science Advisory Board, before Subcomm. on Environment,
Technology, and Standards, House Comm. on Science, Mar. 11, 2004, available at 2004 WL 506081.

PART often conflicts with or complicates other government-wide reform initiatives. Collecting new
types of data within agencies for OMB in order to comply with the PART rating system is often
constrained by the Paperwork Reduction Act, which requires agencies to reduce the number of data
elements collected. Further, the PART and the Government Performance and Results Act (GPRA),
which attempts to develop strategic goals and department and government cross-cutting comparisons
for the federal government through a much more open and accessible process than the PART 
mechanism, are often in conflict with each other, creating added management difficulties and 
increased compliance burden within agencies. 

Furthermore, there are significant obstacles to the data collection that PART demands. Agency data 
collection is constrained by the Paperwork Reduction Act, which requires agencies to obtain OMB 
approval before conducting any information collection that asks the same questions of ten or more 
people. Additionally, data collection efforts, especially the independent evaluations PART expects 
programs to rely on, can be expensive, but PART does not excuse programs that cannot collect the 
expected level of data because of a lack of funding. OMB itself is responsible for these obstacles, 
even as it penalizes programs for running into them. 

Between this Catch-22 and the sometimes absurd mismatch of PART measures and actual program 
purposes, program staff have learned to treat PART as a compliance exercise instead of a guide to 
better management. OMB Watch has conducted extensive, in-depth interviews with agency staff 
involved in PART assessments at the program level. We have heard repeatedly that agency staff 
have spent considerable time “gaming” the PART system — learning the pressure points and pitfalls
to avoid negative scores and consequences. A performance appraisal system so widely regarded as a 
mere compliance exercise offers little diagnostic benefit for agency program managers and is 
another indication PART scores should not be related to budget allocations for programs. 

3. PART’s Bias Toward Specific Program Types

The extreme biases against block grant programs within the PART process are perhaps the most 
egregious and the most obvious example of the problems embedded in the very design of the tool. 

Programs that operate through grants, whether competitive grants or block grants, are rated lower 
on average than all other programs. When OMB rated block/formula grant programs (a category 
that includes both block grants and entitlements) in the FY 2005 process, it found no block/formula
grant programs were “effective” while finding 11 percent of programs in general were “effective.”

In addition, OMB found 43 percent of block/formula grant programs to be ineffective while 
determining only 5 percent of programs overall were “ineffective.” 

The chart below compares the overall breakdown of PART scores in competitive grant programs, 
block grant programs, and all other programs after the reviews were completed for FY 2006. As is 
evident, grant programs rate significantly lower in PART reviews than all other programs on 
average. Further, of the programs rated “ineffective” that were zeroed out completely in the 
president's FY 2006 budget, 89 percent were competitive or block/formula grants. 




Comparison of Grant Programs and All Other Programs in PART
(percentage of programs rated in each category)

                                     Competitive    Block    All Other
                                     Grant          Grant    Programs
Effective or Moderately Effective    24%            27%      49%
Adequate or Ineffective              36%            36%      26%
Results Not Demonstrated             40%            37%      25%


There is an easy explanation for this trend. Federal grant programs largely send money to the state 
and local governments, a system established intentionally by Congress because they have realized 
that in some instances it is vastly more efficient to allow individual states the flexibility to tailor 
their respective programs and initiatives to suit local and regional needs. Some, like the Community 
Development Block Grant, are particularly important to improving the economies of poor and rural 
communities across the country with locally designed projects and programs that address specific 
community needs. Entitlements, meanwhile, will always fail the PART’s demand for linkages
between performance goals and revenue allocation, because entitlements are automatic distributions 
to entitled populations — in other words, PART scores them negatively for being exactly what 
Congress intended them to be. 

The odds are stacked against block/formula grants within the PART because performance review is 
an oversight mechanism, whereas the premise of block grants is that funds are sent to the states with 
certain freedoms from complex federal oversight requirements. Many states and local governments 
have their own performance and accountability review processes; overlaying federal PART reviews 
has the effect of overriding state and local government self-management, contrary to the intent of 
block grant projects. 


B. POLITICAL BIASES EMERGE THROUGH IMPLEMENTATION 

1. Inconsistencies with Presidential Budget Request Cast Doubt on Purpose of Tool

A quick glance at PART ratings and budget requests should dissuade anyone from trying to find a 
logical or consistent pattern between them — there is no pattern. Even after reviewing almost every 
federal program and being used to develop multiple budget requests, it remains unclear if even the 
Bush administration uses the PART ratings to inform their budgeting decisions at the start of each 
year. 

OMB Watch has conducted analyses of the list of programs highlighted by the President in each of
the last two State of the Union addresses for not achieving the required results, as well as the broader
list of programs reviewed under the PART, and found some puzzling results. A few examples:

• Of the 85 programs receiving a top PART score in 2006, the president proposed
cutting the budgets of more than 38 percent, including the National Center for 
Education Statistics and a land grant program run by the Tennessee Valley Authority. 

• Of the programs rated “ineffective” in the 2006 budget that were targeted for
elimination, more than 78 percent came from the Department of Housing and Urban
Development or the Department of Education.

• The Substance Abuse Prevention Block Grant, a program that provides grants to states
to address addiction problems, was given the lowest possible rating of “ineffective” 
but received no reduction in funding. Moreover, the Earned Income Tax Credit 
Compliance Program — which targets lower income working Americans who have 
claimed the EITC and double checks their eligibility for the credit — was rated 
ineffective, yet received a substantial funding increase. 

The examples above are not used to cherry pick arbitrary cases, but underscore a larger pattern of 
inconsistency. Most troubling, in each of the last two years, of those programs singled out by the 
president for failing to produce results, more than two-thirds had yet to be reviewed by the 
PART questionnaire. In many more cases than not, it is unclear what kinds of determinations, if 
any, the president used to identify these supposed failing programs when the White House budget 
staff has not even used their own performance review tool to assess them. 

While other analysts have criticized the failure of the PART to establish a toehold in the budget 
formulation process, we believe these facts point to a larger problem that underscores the need for 
Congress to be highly dubious of the usefulness of using PART scores to inform budget decisions. 
The lack of consistency among ratings and the president's own budget requests points to the 
possibility that the PART is merely a rhetorical tool to support pre-ordained political conclusions. 

2. PART Sends Management Signals that Would Distort Federal Priorities

OMB uses PART to alter the management of federal programs in troubling ways. The PART 
mechanism allows for OMB perspectives and policy preferences to be inserted into the oversight 
and management structures of federal programs without congressional approval. Agency staff 
implementing federal programs are subordinate to OMB within the construction of the survey 
answers in PART, and experience concrete consequences — such as flat or decreased budget 
requests and, if the administration is successful with pay-for-performance proposals, even the 
inability to receive an annual salary increase — if they fail to heed the management signals OMB 
sends through PART. As a result, PART has enormous potential to distort federal priorities in ways 
that Congress has never permitted. 

OMB is, unfortunately, taking advantage of that potential. Many of the stated reasons for scoring 
programs negatively reflect nothing more than OMB’s disagreement with the way Congress 
designed a program by law. OMB does not merely suggest to Congress ways a program can be, in
its view, improved; instead, OMB scores a program negatively and imposes consequences against
it, such as reduced budget requests, simply for following the law. OMB then justifies its decision
using the rhetoric of results rather than a direct statement of its disagreement with Congress. Some
examples: 

• The Consumer Product Safety Commission (CPSC), Occupational Safety and Health 
Administration (OSHA), and Mine Safety and Health Administration (MSHA) were all 
penalized for failing to use economic analysis in their rulemaking processes — even 
though they are forbidden by law and Supreme Court precedent from doing so. The CPSC 
is instructed by Congress not to use cost-benefit analysis when issuing rules specifically 
required by law, such as the rules governing garage door openers and bicycle helmets. 
CPSC (which, despite an otherwise high passing score, was categorized “Results Not 
Demonstrated”) was penalized for following the law and not conducting cost-benefit 
analyses for those rules. CPSC was also scored down for not complying with OMB’s 
demand for using net benefits as a criterion for regulatory decisions, even though CPSC’s 
authorizing legislation instructs the agency to take a different approach in order to 
maximize public safety. The same is true for OSHA and MSHA; OMB scored these 
programs negatively for failing to do “cost-benefit comparisons or monetiz[ing] human
life,” even though their organic acts and Supreme Court precedent forbid these practices.

• OMB criticized the Appalachian Regional Commission (and flat-lined its budget request) 
in FY 2006 by claiming through the PART review that it was not a “unique” program, 
because other existing agencies provide the same services. OMB completely misses the 
point of the Appalachian Regional Commission, which Congress created precisely because 
the existing patchwork of programs was failing to meet the needs of the extraordinarily 
impoverished population of that region. 

• Another program serving rural populations, HHS’s Rural Health Activities program, was
likewise penalized for following the very law that created it. OMB’s criticism from the
PART review speaks for itself: “The major flaw of the Office’s portfolio stems from the
programs’ authorization” (emphasis added). The program was targeted for a drastic cut
(83 percent) in the president's budget this year. 

Interestingly enough, these examples are no longer necessary; in a recent hearing before this very 
subcommittee, an OMB official was asked point-blank whether it is possible for a program to
receive a low PART score simply because it follows the law, and OMB answered, simply, “Yes.”²

This distortion of priorities is also happening in a host of more subtle and indirect ways. Buried in 
the small type of the specific program assessments, the standards actually used to measure program 
“effectiveness” or “results” very often fail to focus on what is most meaningful or relevant about a 
program. One particular example is that the Clean Water Revolving Fund was given a low passing 
score and slated for deep budget cuts, in part because PART measured success based on the 
“percentage of water miles/acres with fish consumption advisories removed.” 


² Add citation to Carper/Johnson colloquy.

This measure is not a scientifically appropriate measure of actual water quality: as EPA recently
announced, the number of rivers and lakes with mercury fish advisories increased in the last ten
years even though the amount of mercury emissions actually declined by 100 tons.³ An increase in
the number of advisories can actually be a sign of success, as it could mean the government is doing 
a better job of monitoring pollution and informing the public. 

These conflicts between the statutory mandates imposed by Congress and the willful arbitrariness of 
OMB are waved away when the assessments are offered to Congress, and the scores are attributed
to the program’s “ineffectiveness” or failure to demonstrate results rather than OMB’s decision to 
measure programs with inapposite criteria or include subjective judgments about a program's worth. 

3. Grade Deflation Allows OMB to Manipulate Levers of Congressional Spending 

OMB has gone to great lengths to advertise the PART as offering an unprecedented level of 
transparency by unearthing vast amounts of government information for the public. 
While it is certainly true OMB has marketed the PART to the public as an open government 
initiative, the most crucial decisions, value judgments, and processes for arriving at the final product 
of a PART rating still remain largely hidden. These can often be the most important aspects of the 
entire process, masking a biased or manipulated product. 

One of the most glaring examples of this is the “Results Not Demonstrated” (RND) rating. It is not 
clear how OMB determines which programs should be shifted into the category of RND. OMB 
assigns weights to the scores from each of the four PART sections and then assesses those scores on 
a grading scale to determine whether a program passes (“Effective,” “Moderately Effective,” or 
“Adequate”) or fails (“Ineffective”). The category of “Results Not Demonstrated” is supposed to be 
reserved for programs that do not generate sufficient data or information upon which a passing or 
failing score can reasonably be assigned. Although explained in PART materials accordingly as an 
indeterminate category, programs relegated to the RND bin are often characterized in White House 
rhetoric as failing programs. 

Indeed, this fact has been confirmed during our interviews with agency employees, as all those 
interviewed told us the RND rating was the absolute worst one a program could receive under the 
PART, far worse than an “Ineffective” rating. 

The RND score is based on failure in a couple of specific questions. It is interesting to observe, 
however, that many of the programs scored RND otherwise score more highly in the section for 
producing results. In fact, 72 of the 178 programs (40 percent) categorized as “Results Not 
Demonstrated” by FY 2006 had scores that, according to OMB’s own grading scale, would have 
been granted passing scores if not for failure on the specific RND-determining questions. Of these 
72 programs, 12 should have received the high score of “Moderately Effective.” These 12 programs 
have higher scores for section 4 — the section that notionally measures actual results — than the 
average score for all programs actually rated “Moderately Effective.” Three programs (the 
Consumer Product Safety Commission, a USDA program for rural water treatment loans, and the 
National Credit Union Administration’s Community Development Revolving Loan Fund) 
actually scored above 75 percent for producing results (substantially higher than the 60 percent 
average for all programs rated “Moderately Effective”). 


^ Add citation to Aug 2004 announcement - this probably came from BNA 

Moreover, the remaining 60 of the 72 otherwise passing programs would have received the middle 
passing score of “Adequate,” again, if not for failing the specific RND-determining questions. 
More than half of them (31/60) scored higher for section 4 than the average section 4 score for all 
programs that actually received the score of “Adequate.” Almost as many (24/60) had overall scores 
that bested the average overall score for programs that OMB allowed to receive the “Adequate” 
score. 
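
The proportions cited above can be recomputed from the counts given in this testimony. The 
Python sketch below is only an arithmetic check; the variable and function names are illustrative 
assumptions and do not come from OMB. 

```python
# Counts taken from the testimony text.
rnd_total = 178            # programs rated "Results Not Demonstrated" by FY 2006
otherwise_passing = 72     # RND programs whose scores would otherwise pass
moderately_effective = 12  # of those, programs that would rate "Moderately Effective"
adequate = otherwise_passing - moderately_effective  # the remaining programs


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


print(adequate)                           # remaining otherwise-"Adequate" programs
print(pct(otherwise_passing, rnd_total))  # share of RND programs otherwise passing
print(pct(31, adequate))                  # share with above-average section 4 scores
print(pct(24, adequate))                  # share with above-average overall scores
```

The first percentage rounds to the 40 percent figure quoted in the text, and 31 of 60 is indeed 
more than half. 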

There is no explanation given for the weighting assigned to any of the particular questions or 
sections, nor for the absurd results once these weights are assigned. This inconsistency highlights an 
important point that emerged from our agency interviews as well. Implementation of the PART 
survey is highly dependent on the individual program officer at OMB, and working with different 
officers can not only completely alter the process by which the survey is completed but also the 
final rating for the program. The Government Accountability Office has also concluded that the 
PART gives a high level of influence to budget officers at OMB and leads to inconsistent 
application of the tool across the federal government.^ 

III. THERE IS A BETTER WAY 

In much the same way other “budget process reform” proposals seek to increase the executive's 
control over federal revenues and spending priorities, PART also attempts to alter the balance of 
power within the federal government. The tool gives the executive a mechanism by which to 
impose its budgetary preferences, however political or biased, on Congress in a seemingly benign 
way by wrapping them in good government and results rhetoric. 

While the President is certainly free to classify federal programs in whatever way he believes is best 
and recommend those programs be supported with increased funding or eliminated according to his 
own preferences, it is disingenuous to attempt to pass off subjective and, at times, politically 
motivated policy conclusions as unbiased program reviews. 

There does seem to be some usefulness for the PART review process to serve as a diagnostic tool 
for program managers and agency employees. In particular, a process known as a PART cross-cut 
undertaken by OMB has shown significant promise as a model to improve efficiency of 
management and stewardship of specific programs across different agencies and departments. To 
our knowledge, this process was devoid of attempts to connect the results to significant alterations 
in budget priorities or alterations to the management agenda for implementing policy decisions. If 
true, I believe these are certainly aspects of the cross-cut that allowed it to be a productive exercise. 

In order for the variety of actors whose input is needed to formulate budgeting decisions to 
use any type of performance review mechanism, it is crucial for those actors to believe the 
information is credible and constitutes a consensus on objectives and goals. 


^ See Government Accountability Office, Performance Budgeting: Observations on the Use of OMB's Program 
Assessment Rating Tool for the Fiscal Year 2004 Budget, No. GAO-04-174 (Jan. 2004). 

This has not been the case with the PART. Many individuals both inside and outside of Congress 
remain highly skeptical of this tool and the process by which the ratings are determined by OMB. 
Perhaps the biggest reason for this skepticism is that the PART is attempting to reinvent the wheel 
from a new perspective. Congress already has the structural and institutional capacity to develop a 
rigorous system of determining results and effectiveness of government programs through the 
appropriating and authorizing processes. The vast resources of the Congress available within the 
committee and personal staff structure, as well as in offices such as the Congressional Budget 
Office, Government Accountability Office, and Congressional Research Service are more than 
sufficient to provide far more robust information about program performance and results. 

Most importantly, relying too heavily on the PART ratings not only will gradually remove Congress 
from its funding and oversight responsibilities granted under the Constitution, but also will continue 
to close the door on opportunities for outside stakeholder interests to be infused into the 
congressional budgeting and evaluation process. This limited perspective on programs and goals is 
a crucial deficiency of the PART. By limiting the perspective of the reviews, the subjectivity and 
bias that will almost always creep into any type of rating does not have a counterbalance from a 
wide range of outside stakeholder interests. 

While the expansion of the executive branch powers has been present in our government, 
particularly during times of war, since the turn of the last century, the overreach of those powers 
into areas historically and constitutionally given to Congress — the structuring of programs, 
appropriating and authorizing of revenues, and oversight of government — is a disturbing trend. 
Because of this, PART scores should be taken with more than just a grain of salt or even a hefty 
dose of skepticism by Congress. Unless the tool design and implementation system are significantly 
modified, they should probably be largely ignored. 

Thank you for the opportunity to share our views with you here today. I look forward to your 
questions. 

ExpectMoreOfTheSame.gov 

The White House released its 2007 budget today, and budget director Joshua Bolten unveiled a new 
website, ExpectMore.gov, that “allows taxpayers to review the [White House] assessments of nearly 
800 federal programs.” “Here, you can see the exhaustive work that goes into each one of these 
assessments,” Bolten said at today’s press conference. “I expect that this website will be a useful tool for 
everyone who cares about how tax dollars are spent.” 

Their “exhaustive work” produced a delusion-riddled website that showcases the White House’s inability 
to assess its own problems and weaknesses. Katrina offers a real-world illustration of the new site’s 
inaccuracies: 

1) “Federal Emergency Management Agency: Disaster Recovery”: 

The Department of Homeland Security’s Recovery program ensures that individuals and 
communities affected by disastes [SIC] of all sizes, including catastrophic and terrorist events, are 
able to return to normal function with minimal suffering and disruption of services. 
PERFORMING: Adequate (one star) 

Reality — Reuters: 

With no clear recovery plan in sight five months after Hurricane Katrina, many victims are 
simply hanging on, waiting anxiously for signs that their neighborhoods are either reviving or 
turning into permanent ghost towns. 

2) “Preparedness — Grants and Training Office National Exercise Program”: 

Prepare Federal, state, and local responders to prevent, respond to, and recover from acts of 
terrorism by providing the tools to plan, conduct, and evaluate exercises. PERFORMING: 
Effective (three stars) 

Reality — GAO: 

Although the [National Response Plan] framework envisions a proactive national response in the 
event of a catastrophe, the nation does not yet have the types of detailed plans needed to better 
delineate capabilities that might be required and how such assistance will be provided and 
coordinated. 

3) “Federal Emergency Management Agency: Disaster Response”: 

The Department of Homeland Security’s Response program is designed to quickly, efficiently 
and effectively provide support to State, Tribal, and local governments, and Federal response 
teams in the event of a natural or manmade disaster, emergency or terrorist event. 
PERFORMING: Adequate (one star) 

Reality — Washington Post: 

Four years after the Sept. 11, 2001, attacks, administration officials did not establish a clear chain 
of command for the domestic emergency; disregarded early warnings of a Category 5 hurricane 
inundating New Orleans and southeast Louisiana; and did not ensure that cities and states had 
adequate plans and training before the Aug. 29 storm, according to the Government 
Accountability Office. 

Filed under: Katrina 


Posted by Payson February 6, 2006 5:28 pm 

2. Program Planning and Measurement 
a. Program Assessment 

Each year, the [Environmental Protection Agency’s Science Advisory] Board tries to 
evaluate EPA’s research priorities and their role in meeting the Agency’s goals. As part of the 
current review, the Board was given information resulting from the application of a new survey 
tool, the Program Assessment Rating Tool (PART) that was used to evaluate selected EPA 
programs. The Board is concerned that decisions are being made about research program funding 
on the basis of the application of this new tool. 

To be clear, the Board did not receive or review information on the rating instrument 
itself; however, after evaluating PART summaries for several research programs, our conclusion is 
that PART may, at this time, have a limited capacity to inform budget decisions on research 
programs. The Board is concerned with the manner in which the weighting formula in PART 
seems to influence the full analysis and thus favor programs with short-run results over those 
having long term results. There is also concern that an evaluator’s subjective considerations might 
be able to bias those weights and the rating itself. 

Specifically, it appears that the weighting formula in the PART favors programs with near- 
term benefits at the expense of programs with long-term benefits. Since research inevitably 
involves more long-term benefits and fewer short-term benefits, PART ratings serve to bias the 
decision-making process against programs such as STAR ecosystem research, global climate change 
research, and other important subjects. The PART seems to be intended as a formula for 
predictions about likely program success. However, the weights that the PART assigns to different 
program characteristics do not seem to have been validated systematically against the contribution 
of each program characteristic to any independent objective measure of program success. If the 
weights in the tool are arbitrarily assigned, the PART may have characteristics that could lead to 
biases in evaluation that are related to the subjective judgments of its designers. We believe that the 
tool should be reviewed to determine its adequacy for its use in supporting budget decisions. 

As the Board observed significant decreases in science and research funding, it also noted a 
substantial resource increase in the State and Tribal Assistance Grant account (STAG) for an 
initiative for retrofitting school busses. The Board does not challenge the worthiness of this 
program; rather, it notes that it has no information on the science supporting this initiative. The 
Board trusts that the benefits of this program have been rigorously reviewed. 

The real issue here is how research programs (and others) are to be evaluated and whether 
a different metric is necessary for basic vs. applied research programs. Also of interest is whether 
research results should be evaluated separately from the outcomes of programs they are intended 
to support. Although the Board did not directly evaluate the PART itself, it is of obvious difficulty 
to conceive of a simple quantitative metric that could be applied across the broad areas of 
ecosystem quality, human health effects, endocrine effects, and technology development. The 
question is even more complex when you consider that some research is intended to develop 
limited data in the short-run to fill a specific knowledge gap and other research is intended to 
provide an understanding of whole systems in the long-term. Research program measurement is 
even more difficult because the knowledge and methods developed by EPA, especially ORD’s 
researchers, are not usually directly applied by ORD; rather, they are often used by others to 
support decisions on a broad suite of diverse statutory mandates. Thus, we believe that evaluations 
of the performance of research programs will need to consider the specific factors of each program 
that the research is intended to support. Further, it is unlikely that simple formulas will be able to 
handle this task well. It is more likely that realistic research program performance assessment will 
need to be a combination of quantitative metrics and other information and analyses, which are then 
evaluated by groups of experts with relevant knowledge. 

I note that the NAS, in its review of STAR, also had concerns with quantitative routines 
used in performance assessments and noted that “The Committee judges that expert review by a 
group of people with appropriate expertise is the best method of evaluating broad research 
programs, such as the STAR program.” 

— Statement of Dr. Genevieve Matanoski, EPA Science Advisory Board, to Subcommittee on 
Environment, Technology, and Standards, House Committee on Science, March 11, 2004, 2004 
WL 506081 

Introduction 


On February 8, 2005, the Rockefeller Institute held a public policy forum on the state 
and local role in performance management in New York State. The forum was 
co-sponsored by the Rockefeller Institute, the New York State Division of the Budget, and 
the Manhattan Institute. This introduction is organized bottom-up, beginning with the 
local level and then discussing the state and federal levels. 

Speakers at the forum made me feel good. All six of the speakers presented constructive, 
upbeat reports on what they are doing. Their statements reflected a positive view of 
what can be done, and at the same time demonstrated a needed strong dose of realism on 
how hard it is to get good performance data that can influence state and local public 
management. 

The speakers to a person stressed using performance management systems to monitor 
and ratchet up performance to achieve clear goals on a timely basis, not annually but 
much more regularly (preferably on a monthly basis), with extensive interaction 
between agency leaders and the managers of agency programs. 

In the Dall Forsythe edited volume published by the Rockefeller Institute Press on 
performance management,^1 one of the major chapters on state and local performance 
management (of which there are several in this volume) is on the CompStat performance 
management system in New York City for the New York Police Department. Crime 
reduction is the main goal. Dennis Smith, who is a co-author with William Bratton of the chapter 
on CompStat, presented an update of this chapter and an appraisal of how other 
performance management systems, outgrowths of CompStat, are being implemented in New 
York City. 

Swati Desai moderated the panel and presented a talk on how the JobStat system in 
New York City works to monitor and manage the performance of the City’s 26 Job Centers 
for welfare and related human services. I have attended Thursday morning meetings on 
JobStat where the commissioner and his/her chief aides meet and interact with the heads 
of two of the City’s job centers. I was, and continue to be, impressed by this demonstration 
of performance management in action — where it matters most, at the front lines. 

Also at the afternoon session, Fred Wulczyn, a leader nationally on performance 
management for child welfare programs (foster care, adoptions, family preservation and 
abuse prevention), described his role in designing and helping to operate New York City’s 
EQUIP system. This performance management system, which relies on techniques 
developed at the University of Chicago, has had extensive practical application. Because it has 
been field-tested and operates with carefully scrubbed data, EQUIP is used for ranking 
and decision making about the sponsorship and funding of child welfare services. 


1 Quicker, Better, Cheaper?: Managing Performance in American Government, edited by Dall W. Forsythe. See 
Appendix A for the Table of Contents. Go to http://www.rockinst.org/publications/ripress_books.html for 
more information. 

All three systems — CompStat, JobStat, and EQUIP — have developed over time and 
operate in real time. They are success stories where success is most critical. 

The morning session on state-level performance management was organized by the 
New York State Division of the Budget. The first speaker was Chauncey Parker, Director 
of New York’s Division of Criminal Justice Services. He concentrated on New York City’s 
CompStat system, praising its architect, Jack Maple, and noting that he had attended 
upwards of 150 CompStat review meetings. Parker stressed what he called “the three Ds”: 
Defining goals clearly, having timely accurate Data, and holding people accountable in 
well-organized Deliberation processes. He described the CrimeStat system his office has 
established to partner with 15 major urban counties in New York to create similar 
performance management systems, focused like CompStat on crime reduction. 

The second state-level speaker was Robert Fleury, First Deputy Commissioner of the 
Office of General Services, assisted by Rebecca Meyers. An important contribution Fleury 
made was to emphasize the way the mission of an agency affects its goals and management 
system. The Office of General Services, he said, is “a decidedly operational organization 
that builds, fixes, and maintains state facilities.” Its performance management system is 
necessarily inward looking — a tool for agency management. 

In the discussion of Fleury’s presentation, Edward Ingoldsby, Division of Budget 
Chief Budget Examiner, highlighted points brought out by Fleury. Ingoldsby noted that 
performance management works best “on an agency-by-agency basis where you have 
strong commissioner level support.” He added that it is difficult “to link performance 
management with the formal budgeting system.” Performance management is not well 
suited as a tool for budgeting. Doing this, he said, can undermine its efficacy as a 
management tool. 

A good example of how hard it is to avoid problems in performance management if 
the budgetary stakes are tempting was brought out by John Reed, New York State 
Department of Civil Service. He cited a mis-specified goal for the sanitation system in New 
York City, the amount of refuse collected. Reed said, “they discovered they were hosing 
down the truck to increase the weight people were delivering.” 

The third speaker at the morning session was Andrew Eristoff, Commissioner of the 
New York State Department of Taxation and Finance. He previously served as both a City 
Council member and agency head at the local level in New York City, so he brought a 
multi-level perspective to the discussion. We use performance management “to manage 
our state-of-the-art taxpayer and collection call centers, to reduce waiting times, allocate 
resources, adjust hours, and match employee skills to caller issues.” This, he said, is 
“embedded in our culture.” Eristoff described the agency’s “compliance continuum” and 
talked about the challenges involved in making such a system work well, which he said 
requires that it be “a continuing process.” The latter point reflects an important 
generalization: namely, that performance management has to be dynamic, with frequent 
adjustments of goals and measures to reflect changed conditions and policy preferences. 

In the question-and-answer sessions, there was discussion about how agency 
executives can pull together and showcase performance management systems. The Mayor’s 
Management Report in New York City was discussed: how it has been slimmed down 
over time, how it is sometimes viewed too much as a political document, and the reasons 
why governments have to be careful not to “over-integrate” and oversimplify performance 
management conceptually and operationally. 

Although it was not the subject of the forum, it is appropriate to add a discussion of the 
federal role in performance management. For both the federal and state role, my view is 
that their role should be primarily a leadership, catalytic, and teaching role, except for 
agencies where the federal government or the state has operating responsibility. (In the 
Forsythe volume, the chapter on performance management by the Social Security 
Administration is a demonstration of this point.) 

Unfortunately, there is a strong tendency at the national level for the federal 
government to design and require the implementation of elaborate performance management 
systems that fail because they misunderstand the federalism terrain. Both the 1993 law 
passed by Congress, the Government Performance and Results Act (GPRA), and the 
results measurement system adopted by the Bush Administration, focused on what are 
called Program Assessment Rating Tools (PART), have this problem.^2 

The Bush administration frequently stresses “results” in budget documents, using 
PART scores to justify budget changes, which in the current fiscal environment are mostly 
expenditure reductions. This is unfortunate. For one thing, it can cause the kind of gaming 
and distortions that undermine the idea of smarter, stronger, data-driven management to 
improve program performance. For another, it misses a critical point. The fact that a 
program is underperforming doesn’t mean its goals are unimportant. Maybe, to the contrary, 
the purposes involved are so important that more money is needed along with better 
managerial capability to carry them out. Performance management is best suited, as its name 
indicates, to managing performance. It is strongest and most useful if carried out at the level 
of operational responsibility. 

When we decided to publish this report in hard copy, we asked all of the participants to 
work with us on editing their presentations and I thank them for doing so. Michael Cooper, 
Director of Publications, supervised the preparation of this report; Irene Pavone in my 
office worked with us to organize and review the material presented. I thank both of them for 
their help. 


Richard P. Nathan 


2 Richard P. Nathan, “Presidential Address: ‘Complexifying’ Performance Oversight in America’s 
Governments,” Journal of Policy Analysis and Management, vol. 24, no. 2 (2005): 207-215. See Appendix B. 

Presidential Address: “Complexifying” Performance 
Oversight in America’s Governments 


Richard P. Nathan 


In keeping with my focus as President of APPAM on the “M” in our name, this talk 
deals with the performance management movement in American government. 

Senator Daniel Patrick Moynihan, in a moment of frustration with me at a 1989 
Senate hearing on welfare reform, said I am a “complexifier.” In this spirit, the way 
I see it is that the performance management movement in American government is 
on the right track, but that it oversimplifies. I want to serve today in the role of a 
"complexifier" for the performance management movement. My aim is to be 
constructive: to suggest ways in which efforts to improve government performance can 
be reconciled with the pluralistic setting of U.S. public management. 

Leaders in the federal government over the past 40 years have oversold simplistic 
systems for fulfilling public policy goals as expressed in the alphabet soup of 
systems like PPBS, MBO, ZBB, NPR, and GPRA. 

The Alphabet Soup 

PPBS stands for Lyndon Johnson's Planning-Programming-Budgeting System, 
adopted with much fanfare and based on private industry and Defense Department 
systems to assess and compare public spending options. MBO was Nixon's 
successor approach for Management by Objectives. ZBB was Carter's more radical 
initiative for zero-based budgeting to rank all spending options from the ground up in 
allocating government funds. NPR was Clinton's National Performance Review, 
which sought to focus government management and budgeting on achieving results. 
In 1993, a system for strategic results-based management and budgeting was 
enacted into law under the Government Performance and Results Act (GPRA). 

For shock value, at the end of October of this presidential election year, I want to 
say something good about the newest alphabetically named performance 
management system. No matter what happens in the presidential election next week, I 
suggest this slogan, "Let's not part with PART," referring to the George W. Bush 
Administration's "Program Assessment Rating Tool" to compile effectiveness ratings for all 
federal programs. 

THE PART SYSTEM 

According to the FY 2005 budget, the Administration has "PARTed" (this verb form 
is used frequently in the budget) 400 programs representing 40 percent of the 
budget. Two hundred programs will be added in FY 2006. These evaluations are 
based on 30 questions to rate each program that is assessed in four areas: 20 
percent for its purposes and design being clear, 10 percent for strategic planning, 20 
percent for program management, and 50 percent for results. The ultimate goal is 
to evaluate the performance of all federal programs (over 1,000) in this way. 
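
The section weights just described lend themselves to a short worked example. The Python 
sketch below combines four section scores into one weighted total using the 20/10/20/50 split 
quoted above. The function name, the 0-100 scale for section scores, and the sample scores are 
assumptions made for illustration; they are not taken from OMB materials. 

```python
# Section weights as quoted in the text: purpose and design 20 percent,
# strategic planning 10 percent, program management 20 percent, results 50 percent.
SECTION_WEIGHTS = {
    "purpose_and_design": 0.20,
    "strategic_planning": 0.10,
    "program_management": 0.20,
    "results": 0.50,
}


def weighted_part_score(section_scores):
    """Return the weighted total for section scores given on a 0-100 scale.

    Hypothetical helper: the name and the 0-100 scale are assumptions.
    """
    missing = set(SECTION_WEIGHTS) - set(section_scores)
    if missing:
        raise ValueError(f"missing section scores: {sorted(missing)}")
    return sum(SECTION_WEIGHTS[name] * section_scores[name] for name in SECTION_WEIGHTS)


# Illustrative (made-up) program: strong on paper, weak on demonstrated results.
example = {
    "purpose_and_design": 100,
    "strategic_planning": 80,
    "program_management": 90,
    "results": 40,
}
print(round(weighted_part_score(example), 1))
```

Because results carry half the weight, a program with a results score of 40 on this assumed 
scale could not exceed 70 overall, however well it did in the other three sections. 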

PART assessments draw one of five conclusions: effective, moderately effective, 
adequate, ineffective, or results not demonstrated. They are available online.^ 
Illustrative ratings are for Head Start "results not demonstrated," for Medicare 
"moderately effective," and for the Community Development Block Grant (CDBG) 
"ineffective." Reading PART reports (most of them are about eight to ten pages long), 
one is struck both by their brevity and their variation. Some, for example on food 
stamps, are evenhanded and draw on a range of sources from within government 
and outside. Others are analytically weak. 

DISCUSSION 

The critical challenge for the PART system and efforts like it is setting performance 
goals. Where do goals come from? The answers vary. In some cases, it is wishful 
thinking in the form of political over-promising. In other cases, and these are the 
ones I care about, performance goals are based on research, policy analysis, and 
expertise. For this big category, I believe there should be broader consultation and 
collaboration involving policy officials, program managers, policy analysts, 
academic policy researchers, and public management experts. 

In situations in which the results of definitive social science experiments based on 
random-assignment studies can be drawn upon, they materially aid policymakers 
by giving them a high level of confidence about impacts. Such studies, however, are 
not available, and indeed could not be conducted, for all types of public programs and 
for all types of affected groups and policy conditions and needs. 

Terminology is important here. An impact means we can show that a particular 
public program caused something to happen that would not have happened 
otherwise. An outcome is the word customarily used to express a program’s performance 
goals, regardless of whether we know if its activities are additive. Frequently, two 
types of people are involved: policy researchers, who generally favor experimental 
studies of program impacts, and management experts, who focus on outcomes. The 
two groups often have different mindsets and skill sets. As I see it, public policy and 
public management researchers (especially active members of APPAM among 
them) should contribute to knowledge about both impacts and outcomes. 

Leaders in the public service are called upon in many (indeed, most) situations to 
set and periodically adjust outcome goals for performance management based on 
the most pertinent public policy knowledge available, and also drawing on 
expertise, experience, and observation. And yet even with the best of such efforts, 
governments often do not deal wisely with another critical challenge discussed in this 
paper, the need to devise politically acceptable and workable performance goals 
that can't be gamed for undesirable purposes. 

While I think the PART system is on the right track in focusing on individual 
programs as the basic building blocks for assessing managerial performance, major 
problems are that it is too centralized, too insular, and not sufficiently 
discriminating. It does not adequately take into account the great differences that exist in the 
size, importance, operational character, and settings of different public programs. I 
have a particular federalism problem in this context. 


^ See www.whitehouse.gov/omb. Accessed September 7, 2004. See also Rodriguez, 2004, pp. 56-61. 

THE FEDERALISM CHALLENGE OF PERFORMANCE MANAGEMENT 

Most domestic programs of the federal government operate by indirection 
through state and local governments that in many cases contract with nonprofit 
and private corporations to provide public services. It is beyond the scope of this 
talk to survey the backward and forward bounces of decentralization in Ameri- 
can intergovernmental relations. Suffice to say, as Martha Derthick emphasizes, 
there are many ways in which members of the Congress and federal Executive 
Branch officials attempt to influence domestic affairs by adopting narrowly 
focused grant-in-aid programs and imposing and enforcing conditions, regula- 
tions, and guidelines on their operation. On the other hand, there is also a long 
history of devolutionary efforts to broaden grants (for example, by creating block 
grants), loosening or not enforcing regulations, granting waivers of federal 
requirements that are sought by state and local governments and interest groups, 
and generally by virtue of the fact that policymakers often disagree on public 
purposes and as a result adopt vague or even contradictory goals for domestic 
public policies.² 

Wherever the devolution ball bounces, the point that stands out for me is that 
state and local governments have to be involved in assessing and improving pro- 
gram performance, especially under federal grant-in-aid programs that are broad 
gauged and have multiple purposes, as is so often the case. Federal agency offi- 
cials should work with state officials in ways that are not heavy handed. They 
should adopt approaches that are continuous, user-friendly, candid, and appro- 
priately intergovernmental. 

This recommendation gets me into the consideration of differences between 
inputs, outputs, and intermediate and end outcomes as the goals of performance 
management.³ It is arguable as a general principle that under broad-gauged federal 
grants-in-aid, the federal government should care most about organizational out- 
puts, and in turn work with states to stimulate them to assess and help them in the 
best ways they can to define and measure the outcomes for individual participants 
of these federally aided domestic programs in the different environments in which 
they operate. This is what we actually do in more cases than is acknowledged. 

Governmental programs have literally thousands of iterations. Reconciling two 
values, the flexibility (which is inherent to the diverse governmental environment 
of American federalism) with accountability (which policymakers and adminis- 
trators should care about and achieve), requires that the PART system be inter- 
governmentally sophisticated. But this is not the only way the performance man- 
agement movement needs to be complexified. 


² For useful essays on this subject, see Derthick (2001) and also her other writings. 

³ See the attached definitions by Hatry, 2001, p. 19. This part of the discussion concerns "units of analy- 
sis," as well as measured goals. For some government programs, performance management goals involve 
individuals as the units of analysis (students, job seekers, sick people). For others, and for reasons that 
often reflect inherent limitations of measurement, we settle for (or may even decide we prefer) organi- 
zational goals as the units of analysis: Did service providers do what they were supposed to (for exam- 
ple, social security offices, state food stamp programs)? The subject of organizational performance is 
treated in Nathan (2000), chapters b, ^)0, which describe institutional evaluations of the implementa- 
tion of national welfare reform policies. 
THE GAMING CHALLENGE OF PERFORMANCE MANAGEMENT 

The Office of Management and Budget in its on-line documentation on the PART 
system says it is not primarily a budget tool. I agree with this idea. The PART sys- 
tem should not be sold solely as a method for deciding that such and such a pro- 
gram doesn't work so we should cut it, or that it does work and we should add 
resources. That is not to say that weak programs should be retained or that they 
should receive more resources. There are situations in which performance findings 
should influence budgeting. But if PART is used mainly for this purpose, the likeli- 
hood is that it will lose its managerial efficacy. 

This possibility of gaming raises difficult questions for performance management 
involving how to set performance goals so that they influence agency behavior in 
the desired ways. Doing this requires both hard and soft accountability measures. 
In private industry, soft decision factors are often taken into account in rewarding 
managers. Economists have affirmed the wisdom of blending objective and subjec- 
tive performance measures in this way (Baker, Gibbons, and Murphy, 1994). Unfor- 
tunately, however, the preponderance of attention and literature on managerial 
oversight in government has focused on rigid numeric goal setting. 

In this intense 2004 pre-election period, it is not easy to envision the kind of con- 
sultative and interactive goal setting that could wisely and in evenhanded ways 
bring qualitative and policy related decision factors into play in performance over- 
sight. This is made harder right now by cuts that have taken place in the manage- 
rial staffing of domestic programs, because overseeing such goals is necessarily 
labor intensive. Moreover, countering the rise in rating-mania by bringing qualita- 
tive and political variables into play in performance oversight is hard to prescribe. 
Yet, the truth is that, like speaking prose, we do it all the time, as suggested by some 
of the examples discussed next. 

SOME EXAMPLES 

In the field of employment and training, random-assignment demonstrations have 
shown that focusing on jobs is effective in aiding low-income, low-skilled people, 
especially women. However, performance-management goals focused on these 
dependent variables involving, for example, job placement and tenure can have 
problems. They can produce cream skimming, that is, selecting the most job-ready 
people to be aided, people who would have found jobs anyway. Corrections for this 
problem have been attempted, but can backfire. In one case I know of, the result 
was to undermine the public employment programs of the 1970s, leading to their 
eventual demise (Cook et al., 1985). 

In the field of K-12 education, the emphasis on numeric goal achievement has 
caused unintended consequences that have required political jockeying and mana- 
gerial recalibration. To paraphrase Tip O'Neill, all education is local. No matter how 
elegantly formed, the nature of the educational process requires that national over- 
sight goals for local schools be focused on simple and general indicators, such that 
they tell only part of the story about school performance. For this reason it is impor- 
tant that performance goals for local schools be viewed as motivational every bit as 
much as being viewed as tools for the close and fulsome calibration of institutional 
effectiveness. 

In the field of health policy, experts have urged the adoption of financial incen- 
tives for effective practice and quality enhancement, but at the same time acknowl- 
edge that the processes for providing and overseeing such standards face impedi- 
ments (Epstein et al., 2004). Furthermore, we are told that such goals are unlikely 
to achieve their intended results unless substantial amounts of new money are pro- 
vided (Epstein et al., 2004). Where is the money going to come from? 

Citing these examples is not meant to suggest that setting performance goals and 
assessing how well they are achieved is unwise. The point is that performance man- 
agement is difficult, can be expensive, and worst of all can backfire. It has to be 
smart and it has to be flexible, adaptive, and subtle. Stimulating efforts to ratchet 
up program performance on the part of the armies of governmental and quasi-gov- 
ernmental workers at every level of government who implement public policies 
requires setting and treating performance goals so that they serve both as targets for 
managers and as symbols for the public, in the latter sense, symbols that are well 
and widely understood and accepted. Raising student performance by using per- 
formance goals to focus attention on shared values about what our schools should 
teach has a good effect, and yet in the process of doing this we have learned about 
pitfalls of over-specification and have had to make adjustments. This is as it should 
be; undue rigidity in setting performance goals can undermine their effectiveness. 

A good illustration of the latter point is shown by the recent success of welfare 
reform in reducing dependency and facilitating employment. The 1996 national 
welfare reform law dramatically changed bureaucratic behavior. Seemingly, this 
was rooted in strict and specific goals about jobs, hours of work, etc. But in reality, 
there was a big loophole, the "caseload reduction credit." This provision of the law, 
plus others, enabled states to advance their work-first purposes in a manner that 
reflected varied state and local values and conditions. The process was incremental 
and typically American. The success achieved was not the result of crafty planning 
by calculating policymakers. It came about serendipitously in ways that were sur- 
prising to us in our implementation studies of the 1996 national welfare reform law 
(Nathan and Gais, 1999; Gais et al., 2001, pp. 35-69). 

MODIFYING THE PART SYSTEM 

To reiterate, what is appealing to me about the PART system is that it focuses on 
individual programs as the basic building blocks, more so than on strategic and 
often overly elaborate purposes as advocated under previous federal government 
management reforms (Radin, 2000, pp. 111-135). What is needed, however, is a 
more candid and flexible treatment of goals in ways that involve a range of schol- 
arly and expert perspectives. 

Ann Blalock and Burt Barnow have called for a partnership for connecting 
demonstration and evaluation studies conducted by academic experts with per- 
formance management systems: 


Our recommendation is that competent evaluation research, or applied social science 
research, must be coordinated with or integrated within performance management sys- 
tems if precise, valid, reliable information about social programs is to be made available 
to decisionmakers. (Blalock & Barnow, 2001, pp. 487-519) 


Blalock and Barnow point out that what they call the "evaluation movement" was 
developed in the crucible of academia, while the "performance management move- 
ment" has its roots in public administration and in administrative bureaucracies. 
They believe, and I agree, that collaboration between these two movements would 
yield important benefits. 
We recommend that the major direction for the future is to coordinate evaluation 
research with performance management systems more fully, moving toward full inte- 
gration of evaluations within performance management. Such integration will require 
that performance management systems treat evaluators not as aliens from outer space, 
who land only periodically to study and give advice, but as part of an interdisciplinary 
team. It will require that evaluators become more sensitized to managers' needs, to have 
ongoing information for tracking outcomes, and to express the benefits of their profes- 
sional tools with greater humility. (Blalock & Barnow, 2001, pp. 487-519) 


The two "movements" as described by Blalock and Barnow have different disci- 
plinary bases. The impact/experimental movement is essentially Weberian in its 
assumptions about bureaucracy and implementation: Programs should be precisely 
controlled and replicated. By contrast, performance management seeks to establish 
goals as reference points for managers who are encouraged to innovate and change 
what they are doing in response to continuous feedback. 

Just as there are many players in the bargaining processes for policymaking in 
American political pluralism, multiple players should be involved in deciding and 
adjusting the outcome goals for performance management. 

Who should be the multiple players in outcome goal setting? Program managers 
at the appropriate levels of government (federal, state, and local) bring a needed 
perspective to bear about the likely effects of different performance goals in differ- 
ent settings. Researchers and policy analysts can also play an important role draw- 
ing on research that shows what programs are likely to work best and what their 
effects are when they do. But, it isn't enough to know if a program works. The 
responsible goal-setting officials also need to work their way through the hard ques- 
tions raised earlier about how to express such findings in politically acceptable 
ways in performance goals that can't be gamed for undesirable purposes. Another 
important group of players is elected and appointed political officials; they clearly 
have a role to play in setting performance goals that reflect executive branch and 
legislative policies. 

There is still another overarching aspect of this question about who should par- 
ticipate in performance oversight involving auspices. The GPRA law seeks the joint 
role of Congress and the Executive, whereas the Bush Administration's PART sys- 
tem gives the strong lead role to the Office of Management and Budget. While 
someone (that is, some agency) has to be in the lead, the challenge, as I see it, is not 
so much a challenge involving agency roles as one involving transparency. There 
needs to be a high level of transparency in sharing information about performance 
oversight. This is needed in the case of the PART system as it applies to other con- 
trol agencies besides OMB, so that they can participate in performance oversight. 
The Congressional Budget Office, the General Accounting Office, and the Congres- 
sional Research Service, along with state and local budget and management offices 
and outside evaluators, need to have access to the underlying data that are used and 
understand how they are used in PART performance evaluations. 

In line with the "complexifying" theme of my talk, this challenge of sharing data 
is made more difficult as more data are brought to bear, and more data should be 
brought to bear in performance management. An obvious and important opportu- 
nity is the availability of administrative data. We are virtually drowning in admin- 
istrative data. Almost all domestic programs have reporting requirements that are 
extensive and detailed. Yet, despite this fact, and despite the fact that the quality of 
administrative data varies, not enough effort has been made to clean up these data, 
to scrutinize them and compile the best sources. Here, the underlying federalism 
condition that states are different (that is, laboratories not just of democracy, but 
also of data availability) creates an environment in which multi-state studies can 
take advantage of opportunities to use administrative data wisely and more exten- 
sively.⁴ Making fuller use of the best existing administrative data sets can have the 
important additional advantage of enabling evaluators to break out data for differ- 
ent groups of program participants in different kinds of program settings. 

Another step that I believe would advance the art and practice of performance 
management would be to relax a little, loosen up the rating games. Instead of widely 
advertised U.S. News and World Report-type scorecards and report cards, people 
inside America's governments would do well to present performance results in more 
detailed and nuanced ways, explaining why and how they operate under different 
programs and conditions. 

As a political scientist, I feel obliged at this juncture to go more deeply into my 
reason for putting so much emphasis in this talk on the dynamism, diversity, and 
fragmentation of the political milieu in which public services are provided. The rea- 
son is that so often in government, administration is policy. Policymaking and 
administration are intrinsically linked. The day-to-day promulgation of rules and 
guidelines; the issuance of policy interpretations; the approval of grants; appoint- 
ments and staffing decisions, all involve the transmission of values in ways that are 
more than routinely ministerial (Nathan, 1983).⁵ The public administration litera- 
ture is deficient in recognizing that policies and programs are changing all the time. 
Performance management must take this dynamism into account. It has to be seen 
as a continuous process. It has to be carried out by trial and error. It cannot be 
accomplished by fixed "one-size-fits-all" managerial formulas. 

This is not to say that we should stop working hard on performance rating, only 
that these ratings should be presented in thorough and nuanced ways, and that they 
should be explained in statements that describe the strengths and limitations of the 
technology and data on which they are based. 

It was not my intention in this talk to make specific recommendations for modi- 
fying the PART system. Moreover, my sense is that this should not be done in a rigid 
and formal way. Discussions about who is consulted and what kinds of inputs are 
used in setting performance are necessarily situational.⁶ 

Opening up the PART system (and state and local PART-like performance man- 
agement systems) to more players and greater variations can also achieve some- 
thing that is very practical and low tech. It can help deal with the information explo- 
sion in public affairs. We are virtually flooded with reports, e-mail messages, books, 
articles, conferences, and harangues. Performance management systems like PART, 
which pull together what is known about the effects of public programs, can inform 
and clarify debates about what America's governments do and the immense, 
diverse, and complex ways their actions affect social and economic conditions. 

I realize that this plea for nuanced discourse, greater selectivity, flexibility, and 
transparency is wishful. "Complexifying" performance management in these ways 


⁴ Some large cities, notably New York City, have innovative performance management systems, which 
can provide valuable lessons and insights. 

⁵ On management tracking, I suggest Heinrich, 2002, pp. 712-725. 

⁶ The best book that shows how public management is situational is Bureaucracy by James Q. Wilson 
(2000 edition). Wilson describes organizations as needing "to acquire sufficient freedom of action." 
Based on his teaching and research, Wilson said he discovered that what was most needed was to per- 
mit an organization "to define its tasks as it saw best and to infuse the definition with a sense of mis- 
sion." See chapter 2, "Organization Matters," pp. 14-28. 
requires the application of judgment, and this invariably opens opportunities for 
political maneuver. Still, the fact of the matter is that we can never depoliticize per- 
formance management, nor should we even think of trying to do so. In the final 
analysis, what is needed is to bring more players into the process in ways that cre- 
ate a wise balance of expertise and politics. The long-term result would be to create 
a setting in which there is a more realistic focus on program results that gives citi- 
zens an honest, believable, and realistic portrayal of what their governments do and 
how well they do it. 

I consulted with many experts with varied perspectives and experiences in writing this paper. 
Suggestions and comments were received from David Balducchi, Burt Barnow, Douglas 
Besharov, Patricia Billen, Jonathan Breul, Martha Derthick, Swati Desai, Erik Devereux, 
Thomas Gais, William Grinker, Judy Gueron, Harry Hatry, Carolyn Heinrich, Michael Lipsky, 
Irene Lurie, Lawrence Lynn, Gerald Marschke, Lawrence Mead, Mark Nadel, Sonia Ospina, 
Beryl Radin, Justine Rodriguez, Frank Thompson, and Barry White. Irene Pavone worked 
ably and patiently with me in this consultation process. In the usual way, the ideas and points 
made in this talk are my responsibility alone. 

RICHARD P. NATHAN, director, Nelson A. Rockefeller Institute of Government, State 
University of New York. 


ATTACHMENT 

Performance Management Definitions 


* Inputs: Resources (that is, expenditures or employee time) used to produce 
outputs or outcomes. 

* Outputs: Products and services delivered. Output refers to the completed 
products of internal activity: the amount of work done within the organiza- 
tion or by its contractors (such as a number of miles of road repaired or num- 
ber of calls answered). 

* Intermediate Outcomes: An outcome that is expected to lead to a desired end 
but is not an end in itself (such as service response time, which is of concern 
to the customer making a call but does not tell anything directly about the 
success of the call). A service may have multiple intermediate outcomes. 

* End Outcomes: The end result that is sought (such as the community having 
clean streets or reduced incidence of crimes or fires). A service may have 
more than one end outcome. 

* Efficiency, or Unit-Cost Ratio: The relationship between the amount of input 
(usually dollars or employee-years) and the amount of output or outcome of 
an activity or program. If the indicator uses outputs and not outcomes, a 
jurisdiction that lowers unit cost may achieve a measured increase in effi- 
ciency at the expense of the outcome of the service. 

* Performance Indicator: A specific numerical measurement for each aspect of 
performance (for example, output or outcome) under consideration. 


Source: Hatry, 1997, pp. 1-4. 
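
The caution attached to the efficiency definition above can be made concrete with a small worked sketch. It is an illustration only: the function name `unit_cost` and every road-repair figure below are hypothetical and do not come from the source definitions or from any program data.

```python
def unit_cost(total_input: float, units: float) -> float:
    """Return input per unit, for example dollars per mile repaired."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_input / units


# Hypothetical road-repair program over two years (all figures invented).
year1 = {"dollars": 1_000_000.0, "miles_repaired": 50.0, "miles_in_good_condition": 45.0}
year2 = {"dollars": 1_000_000.0, "miles_repaired": 80.0, "miles_in_good_condition": 40.0}

for label, year in (("year 1", year1), ("year 2", year2)):
    # Output-based ratio: cost per unit of work done.
    per_output = unit_cost(year["dollars"], year["miles_repaired"])
    # Outcome-based ratio: cost per unit of the result actually sought.
    per_outcome = unit_cost(year["dollars"], year["miles_in_good_condition"])
    print(f"{label}: ${per_output:,.0f} per mile repaired (output), "
          f"${per_outcome:,.0f} per mile in good condition (outcome)")
```

In this invented case the output-based unit cost falls from $20,000 to $12,500 per mile repaired, which reads as an efficiency gain, while the outcome-based unit cost rises from about $22,222 to $25,000 per mile left in good condition. That is the pattern the definition warns about: a measured increase in efficiency at the expense of the outcome of the service.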
Highlights of GAO-04-174, a report to 
congressional requesters 
What GAO Found 

PART helped structure OMB’s use of performance information for its 
internal program and budget analysis, made the use of this information more 
transparent, and stimulated agency interest in budget and performance 
integration. OMB and agency staff said this helped OMB staff with varying 
levels of experience focus on similar issues. 

Our analysis confirmed that one of PART's major impacts was its ability to 
highlight OMB’s recommended changes in program management and design. 
Much of PART's potential value lies in the related program 
recommendations, but realizing these benefits requires sustained attention 
to implementation and oversight to determine if desired results are achieved. 
OMB needs to be cognizant of this as it considers capacity and workload 
issues in PART. 

There are inherent challenges in assigning a single rating to programs having 
multiple purposes and goals. OMB devoted considerable effort to promoting 
consistent ratings, but challenges remain in addressing inconsistencies 
among OMB staff, such as interpreting PART guidance and defining 
acceptable measures. Limited credible evidence on results also constrained 
OMB’s ability to rate program effectiveness, as evidenced by the almost 50 
percent of programs rated “results not demonstrated." 

PART is not well integrated with GPRA— the current statutory framework 
for strategic planning and reporting. By using the PART process to review 
and sometimes replace GPRA goals and measures, OMB is substituting its 
judgment for a wide range of stakeholder interests. The PART/GPRA tension 
was further highlighted by challenges in defining a unit of analysis useful for 
both program-level budget analysis and agency planning purposes. Although 
PART can stimulate discussion on program-specific measurement issues, it 
cannot substitute for GPRA’s focus on thematic goals and department- and 
governmentwide crosscutting comparisons. Moreover, PART does not 
currently evaluate similar programs together to facilitate trade-offs or make 
relative comparisons. 

PART clearly must serve the President’s interests. However, the many actors 
whose input is critical to decisions will not likely use performance 
information unless they feel it is credible and reflects a consensus on goals. 
It will be important for OMB to discuss timely with Congress the focus of 
PART assessments and clarify the results and limitations of PART and the 
underlying performance information. A more systematic congressional 
approach to providing its perspective on performance issues and goals could 
facilitate OMB’s understanding of congressional priorities and thus increase 
PART's usefulness in budget deliberations. 


United States General Accounting Office 
Washington, D.C. 20548 


January 30, 2004 

The Honorable George V. Voinovich 
Chairman 

Subcommittee on Oversight of Government Management, the Federal 
Workforce and the District of Columbia 
Committee on Governmental Affairs 
United States Senate 

The Honorable Todd R. Platts 
Chairman 

Subcommittee on Government Efficiency and Financial Management 
Committee on Government Reform 
House of Representatives 

The Honorable Sam Brownback 
United States Senate 

The Honorable Todd Tiahrt 
House of Representatives 

Since the 1950s, the federal government has attempted several 
governmentwide initiatives designed to better align spending decisions 
with expected performance — what is often commonly referred to as 
“performance budgeting." Consensus exists that prior efforts — including 
the Hoover Commission, the Planning-Programming-Budgeting-System 
(PPBS), Management by Objectives, and Zero-Based Budgeting (ZBB) — 
failed to significantly shift the focus of the federal budget process from its 
long-standing concentration on the items of government spending to the 
results of its programs. 

In the 1990s, Congress and the executive branch laid out a statutory and 
management framework that provides the foundation for strengthening 
government performance and accountability, with the Government 
Performance and Results Act of 1993¹ (GPRA) as its centerpiece. GPRA is 
designed to inform congressional and executive decision making by 
providing objective information on the relative effectiveness and efficiency 
of federal programs and spending. A key purpose of the act is to create 
closer and clearer links between the process of allocating scarce resources 


¹ Pub. L. No. 103-62 (1993). 
and the expected results to be achieved with those resources. This type of 
integration is critical, as we have learned from prior initiatives that failed in 
part because they did not prove to be relevant to budget decision makers in 
the executive branch or Congress.² GPRA requires not only a connection to 
the structures used in congressional budget presentations but also 
consultation between the executive and legislative branches on agency 
strategic plans, which gives Congress an oversight stake in GPRA's 
success.³ 

In its overall structure, focus, and approach GPRA incorporates two critical 
lessons learned from previous reforms. First, any approach designed to link 
plans and budgets — that is, to link the responsibility of the executive to 
define strategies and approaches with the legislative “power of the 
purse” — must explicitly involve both branches of our government. PPBS 
and ZBB failed in part because performance plans and measures were 
developed in isolation from congressional oversight and resource 
allocation processes. 

Second, the concept of performance budgeting has and likely will continue 
to evolve. Thus, no single definition of performance budgeting 
encompasses the range of past and present needs and interests of federal 
decision makers. The need for multiple definitions reflects the differences 
in the roles various participants play in the budget process. And, given the 
complexity and breadth of the federal budget process, performance 
budgeting must encompass a variety of perspectives in its efforts to link 
resources with results. 

This administration has made the integration of performance and budget 
information one of five governmentwide management priorities under its 
President's Management Agenda (PMA).⁴ A central element in this initiative 
is the Office of Management and Budget's (OMB) Program Assessment 
Rating Tool (PART) that OMB describes as a diagnostic tool meant to 
provide a consistent approach to evaluating federal programs as part of the 


² U.S. General Accounting Office, Performance Budgeting: Past Initiatives Offer Insights 
for GPRA Implementation, GAO/AIMD-97-46 (Washington, D.C.: Mar. 27, 1997). 

³ See Pub. L. No. 103-62, § 2 (1993), 5 U.S.C. § 306 (2003), and 31 U.S.C. §§ 1115-1116 (2003). 

⁴ In addition to budget and performance integration, the other four priorities under the PMA 
are strategic management of human capital, expanded electronic government, improved 
financial performance, and competitive sourcing. 
executive budget formulation process. The PART is the latest iteration of 
50 years of federal performance budgeting initiatives. It applies 25 
questions to all "programs"⁵ under four broad topics: (1) program purpose 
and design, (2) strategic planning, (3) program management, and (4) 
program results (i.e., whether a program is meeting its long-term and 
annual goals) as well as additional questions that are specific to one of 
seven mechanisms or approaches used to deliver the program.⁶ 

To better understand the PART's potential as a mechanism for assessing 
program goals and results, you asked us to examine (1) how the PART 
changed OMB's decision-making process in developing the President's 
fiscal year 2004 budget request; (2) the PART's relationship to the GPRA 
planning process and reporting requirements; and (3) the PART's strengths 
and weaknesses as an evaluation tool, including how OMB ensured that the 
PART was applied consistently. 

To respond to your request, we reviewed OMB materials on the 
development and implementation of the PART as well as the results 
produced by the PART assessments. To assess consistency of the PART's 
application, we performed analyses of OMB data from the PART program 
summary and assessment worksheets for each of the 234 programs OMB 
reviewed for fiscal year 2004, including a statistical analysis of the 
relationship between the PART scores and funding levels in the President's 
Budget. We also identified several sets of similar programs that we 
examined more closely to determine if comparable or disparate criteria 
were applied in producing the PART results for these clusters of programs. 
We reviewed 28 programs in nine clusters covering food safety, water 
supply, military equipment procurement, provision of health care, 
statistical agencies, block grants to assist vulnerable populations, energy 
research programs, wildland fire management, and disability 
compensation. We also interviewed OMB officials regarding their 
experiences with the PART in the fiscal year 2004 budget process. 


* There is no standard definition for the term “program.” For purposes of PART, OMB
described the unit of analysis (program) as (1) an activity or set of activities clearly
recognized as a program by the public, OMB, and/or Congress; (2) having a discrete level of
funding clearly associated with it; and (3) corresponding to the level at which budget
decisions are made.

** The seven major categories are competitive grants, block/formula grants, capital assets
and service acquisition programs, credit programs, regulatory-based programs, direct
federal programs, and research and development programs. Tax programs were not
addressed for the fiscal year 2004 PART process.



GAO-04-174 Performance Budgeting 





As part of our examination of the usefulness of the PART as an evaluation
tool and also to obtain agency perspectives on the relationship between
PART and GPRA, we interviewed department and agency officials,
including senior managers, and program, planning, and budget staffs at
(1) the Department of Health and Human Services (HHS), (2) the
Department of Energy (DOE), and (3) the Department of the Interior (DOI).
We selected these three departments because they had a variety of program
types (e.g., block/formula grants, competitive grants, direct federal, and
research and development) that were subject to the PART and could
provide a broad-based perspective on how the PART was applied to
different programs. With the exception of our summary analyses of all 234
programs, the information obtained from OMB and agency officials and our
review of selected programs is not generalizable to the PART process for all
234 programs. However, the consistency and frequency with which similar
issues were raised by OMB and agency officials suggest that our review
reliably captures several significant and salient aspects of the PART as a
budget and evaluation tool.

Our review focused on the fiscal year 2004 PART process. We conducted
our work from May 2003 through October 2003 in accordance with
generally accepted government auditing standards. Detailed information
on our scope and methodology appears in appendix I. OMB provided
written comments on a draft of this report that are reprinted in appendix IV.


Results in Brief

The PART has helped to structure and discipline OMB’s use of performance
information for its internal program analysis and budget review, made the
use of this information more transparent, and stimulated agency interest in
budget and performance integration. Both OMB and agency staff noted that
this helped ensure that OMB staff with varying levels of experience focused
on the same issues, fostering a more disciplined approach to discussing
program performance with agencies. Several agency officials also told us
that the PART was a catalyst for bringing agency budget, planning, and
program staff together since none could fully respond to the PART
questionnaire alone.

Our analysis confirmed that one of the PART’s major impacts was its ability
to highlight OMB’s recommended changes in program management and
design. Over 80 percent of the recommendations made for the 234
programs assessed for the fiscal year 2004 budget process were for
improvements in program design, assessment, and program management;
less than 20 percent were related to funding issues. As OMB and others
recognize, performance is not the only factor in funding decisions.
Determining priorities, including funding priorities, is a function of
competing values and interests. Although OMB generally proposed to
increase funding for programs that received ratings of “effective” or
“moderately effective” and proposed to cut funding for those programs that
were rated “ineffective,” our review confirmed OMB’s statements that
funding decisions were not applied mechanistically. That is, for some
programs rated “effective” or “moderately effective” OMB recommended
funding decreases, while for several programs judged to be “ineffective”
OMB recommended additional funding in the President’s budget request
with which to implement changes.

Much of the potential value of the PART lies in the related program
recommendations and associated improvements, but realization of these
benefits will require sustained attention to implementation and oversight in
order to determine if the desired results are being achieved. Such attention
and oversight take time, and OMB needs to be cognizant of this as it
considers the capacity and workload issues in the PART. Currently OMB
plans to assess an additional 20 percent of all federal programs annually.
Each year, the number of recommendations from previous years’
evaluations will grow, and a system for monitoring their implementation
will become more critical. OMB encouraged its Resource Management
Offices (RMO) to consider many factors in selecting programs for the fiscal
year 2004 PART assessments, such as continuing presidential initiatives
and programs up for reauthorization. While all programs would eventually
be reviewed over the 5-year period, selecting related programs for review
in a given year would enable decision makers to analyze the relative
efficacy of similar programs in meeting common or similar outcomes. We
recommend that OMB centrally monitor and report on agency
implementation and progress on PART recommendations to provide a
governmentwide picture of progress and a consolidated view of OMB’s
workload in this area. In addition, to target scarce analytic resources and to
focus decision makers’ attention on the most pressing policy issues, we
recommend that OMB reconsider plans for 100 percent coverage of federal
programs by targeting PART assessments based on such factors as the
relative priorities, costs, and risks associated with related clusters of
programs and activities. We further recommend that OMB select for review
in the same year related or similar programs or activities to facilitate such
comparisons and trade-offs.

Developing a credible evidence-based rating tool to provide bottom-line
ratings for programs was a major impetus in developing the PART.
However, inherent challenges exist in assigning a single “rating” to
programs that often have multiple purposes and goals. Despite the
considerable time and effort OMB has devoted to promoting consistent
application of the PART, the tool is a work in progress. Additional guidance
and considerable revisions are needed to meet OMB’s goal of an objective,
evidence-based assessment tool. In addition to difficulties with the tool
itself (such as subjective terminology and a restrictive yes/no format),
providing flexibility to assess multidimensional programs with multiple
purposes and goals often implemented through multiple actors has led to a
reliance on OMB staff judgments to apply general principles to specific
cases. OMB staff were not fully consistent in interpreting the guidance for
complex PART questions and in defining acceptable measures. In addition,
the limited availability of credible evidence on program results also
constrained OMB staff’s ability to use the PART to rate programs’
effectiveness. Almost 50 percent of the 234 programs assessed for fiscal
year 2004 received a rating of “results not demonstrated” because OMB
decided that program performance information, performance measures, or
both were insufficient or inadequate. OMB, recognizing many of the
limitations with the PART, modified the PART for fiscal year 2005 based on
lessons learned during the fiscal year 2004 process, but issues remain. We
therefore recommend that OMB continue to improve the PART guidance by
(1) clarifying when output versus outcome measures are acceptable and (2)
better defining an “independent, quality evaluation.” We further
recommend that OMB both clarify its expectations regarding the nature,
timing, and amount of evaluation information it wants from agencies for
the purposes of the PART and consider using internal agency evaluations
as evidence on a case-by-case basis.

The PART is not well integrated with GPRA, the current statutory
framework for strategic planning and reporting. According to OMB
officials, GPRA plans were organized at too high a level to be meaningful
for program-level budget decision making. To provide decision makers with
program-specific, outcome-based performance data useful for executive
budget formulation, OMB has stated its intention to modify GPRA goals
and measures with those developed under the PART. As a result, OMB’s
judgment about appropriate goals and measures is substituted for GPRA
judgments based on a community of stakeholder interests. Agency officials
we spoke with expressed confusion about the relationship between GPRA
requirements and the PART process. Many view PART’s program-by-
program focus and the substitution of program measures as detrimental to
their GPRA planning and reporting processes. OMB’s effort to influence
program goals is further evident in recent OMB Circular A-11 guidance*
that clearly requires each agency to submit a performance budget for fiscal
year 2005, which will replace the annual GPRA performance plan.

The tension between PART and GPRA was further highlighted by the
challenges in defining a unit of analysis that is useful both for program-level
budget analysis and agency planning purposes. Although the PART reviews
indicated to OMB that GPRA measures are often not sufficient to help it
make judgments about programs, the different units of analysis used in
these two performance initiatives contributed to this outcome. For the
PART, OMB created units of analysis that tied to discrete funding levels by
both disaggregating and aggregating certain programs. In some cases,
disaggregating programs for the PART reviews ignored the
interdependency of programs by artificially isolating them from the larger
contexts in which they operate. Conversely, in other cases in which OMB
aggregated programs with diverse missions and outcomes for the PART
reviews, it became difficult to settle on a single measure (or set of
measures) that accurately captured the multiple missions of these diverse
components. Both of these “unit of analysis” issues contributed to the lack
of available planning and performance information.

Although the PART can stimulate discussion on program-specific
performance measurement issues, it is not a substitute for GPRA’s
strategic, longer-term focus on thematic goals and department- and
governmentwide crosscutting comparisons. GPRA is a broad legislative
framework that was designed to be consultative with Congress and other
stakeholders and allows for varying uses of performance information,
while the PART applies evaluation information to support decisions and
program reviews during the executive budget formulation process.
Moreover, GPRA can anchor the review of programs by providing an
overall strategic context for programs’ contributions toward agency goals.
We therefore recommend that OMB seek to achieve the greatest benefit
from both GPRA and PART by articulating and implementing an integrated,
complementary relationship between the two. We further recommend that
OMB continue to improve the PART guidance by expanding the discussion
of how programs (also known as “units of analysis”) are determined,
including recognizing the trade-offs, implications, or both of such
determinations.

* OMB Circular A-11, Preparation, Submission, and Execution of the Budget, Section 220.

As part of the President’s budget preparation, the PART clearly must serve
the President’s interests. However, experience suggests that efforts to
integrate budget and performance are promoted when Congress and other
key stakeholders have confidence in the credibility of the analysis and the
process used. It is unlikely that the broad range of players whose input is
critical to decisions will use performance information unless they believe it
is relevant, credible, reliable, and reflective of a consensus about
performance goals among a community of interested parties. Similarly, the
measures used to demonstrate progress toward a goal, no matter how
worthwhile, cannot appear to serve a single set of interests without
potentially discouraging use of this information by others. We therefore
recommend that OMB attempt to build on the strengths of GPRA and PART
by seeking to communicate early in the PART process with congressional
appropriators and authorizers about what performance issues and
information are most important to them in evaluating programs.
Furthermore, while Congress has a number of opportunities to provide its
perspective on performance issues and goals through its authorization,
oversight, and appropriations processes, we suggest that Congress
consider the need for a more structured approach for sharing with the
executive branch its perspective on governmentwide performance matters,
including its views on performance goals and outcomes for key programs
and the oversight agenda.

In commenting on a draft of this report, OMB generally agreed with our
findings, conclusions, and recommendations. OMB outlined actions it is
taking to address many of our recommendations, including refining the
process for monitoring agencies’ progress in implementing the PART
recommendations, seeking opportunities for dialogue with Congress on
agencies’ performance, and continuing to improve executive branch
implementation of GPRA plans and reports. OMB also suggested some
technical changes throughout the report that we have incorporated as
appropriate. OMB’s comments appear in appendix IV. We also received
technical comments on excerpts of the draft provided to the Departments
of the Interior, Energy, and Health and Human Services, which are
incorporated as appropriate.
Background


The current administration has taken several steps to strengthen and
further performance-resource linkages for which GPRA laid the
groundwork. Central to the budget and performance integration initiative,
the PART is meant to strengthen the process for assessing the effectiveness
of programs by making that process more robust, transparent, and
systematic. As noted above, the PART is a series of diagnostic questions
designed to provide a consistent approach to rating federal programs. (See
app. II for a reproduction of the PART.) Drawing on available performance
and evaluation information, the questionnaire attempts to determine the
strengths and weaknesses of federal programs with a particular focus on
individual program results. The PART asks, for example, whether a
program’s long-term goals are specific, ambitious, and focused on
outcomes, and whether annual goals demonstrate progress toward
achieving long-term goals. It is designed to be evidence based, drawing on a
wide array of information, including authorizing legislation, GPRA strategic
plans and performance plans and reports, financial statements, inspector
general and GAO reports, and independent program evaluations. PART
questions are divided into four sections; each section is given a specific
weight in determining the final numerical rating for a program. Table 1
shows an overview of the four PART sections and the weights OMB
assigned.


Table 1: Overview of Sections of PART Questions

Section                     Description                                  Weight
--------------------------  -------------------------------------------  ------
I. Program Purpose and      To assess whether                            20%
   Design                   • the purpose is clear, and
                            • the program design makes sense.

II. Strategic Planning      To assess whether the agency sets valid      10%
                            programmatic
                            • annual goals, and
                            • long-term goals.

III. Program Management     To rate agency management of the program,    20%
                            including
                            • financial oversight, and
                            • program improvement efforts.

IV. Program Results/        To rate program performance on goals         50%
    Accountability          reviewed in
                            • the strategic planning section, and
                            • through other evaluations.

Source: GAO analysis of the Budget of the United States Government, Fiscal Year 2004, Performance and Management Assessments (Washington, D.C.: February 2003).
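
As a rough illustration of how the four section weights in table 1 could combine into a single numerical rating, the sketch below computes a weighted sum of section scores. The report states only the weights and that they determine the final numerical rating; the weighted-sum form, the 0-100 scale for each section, the function name, and the sample scores are all assumptions made for illustration.

```python
# Section weights from table 1, expressed as fractions of the overall rating.
PART_WEIGHTS = {
    "purpose_and_design": 0.20,
    "strategic_planning": 0.10,
    "program_management": 0.20,
    "results_accountability": 0.50,
}


def weighted_part_score(section_scores):
    """Combine per-section scores (assumed to be on a 0-100 scale).

    Hypothetical helper: the weighted-sum form is an assumption, not a
    formula quoted from the report.
    """
    missing = set(PART_WEIGHTS) - set(section_scores)
    if missing:
        raise ValueError(f"missing section scores: {sorted(missing)}")
    return sum(PART_WEIGHTS[name] * section_scores[name] for name in PART_WEIGHTS)


# Made-up section scores for a single program, for illustration only.
example_scores = {
    "purpose_and_design": 100,
    "strategic_planning": 50,
    "program_management": 100,
    "results_accountability": 40,
}
example_total = weighted_part_score(example_scores)
print(round(example_total, 1))
```

Because half of the weight sits on the results section, a program in this sketch with a clear purpose and sound management but weak evidence of results still ends up with a middling overall score.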
In addition, each PART program is assessed according to one of seven
approaches to service delivery. Table 2 provides an overview of these
program types and the number and percentage of programs covered by
each type in the fiscal year 2004 President’s Budget performance
assessments.



Table 2: Overview of PART Program Types

Program type                Description                                       Number  Percentage(a)
--------------------------  ------------------------------------------------  ------  -------------
1. Direct federal           Programs in which support and services are        67      29%
                            provided primarily by federal employees.

2. Block/formula grant      Programs that distribute funds to state, local,   41      18%
                            and tribal governments and other entities by
                            formula or block grant.

3. Competitive grant        Programs that distribute funds to state, local,   37      16%
                            and tribal governments, organizations,
                            individuals, and other entities through a
                            competitive process.

4. Capital assets and       Programs in which the primary means to            34      15%
   service acquisition      achieve goals is the development and
                            acquisition of capital assets (such as land,
                            structures, equipment, and intellectual
                            property) or the purchase of services (such
                            as maintenance and information technology)
                            from a commercial source.

5. Research and             Programs that focus on creating knowledge         32      14%
   development              or applying it toward the creation of systems,
                            devices, methods, materials, or technologies.

6. Regulatory-based         Programs that employ regulatory action to         15      6%
                            achieve program and agency goals through
                            rule making that implements, interprets, or
                            prescribes law or policy, or describes
                            procedure or practice requirements. These
                            programs issue significant regulations, which
                            are subject to OMB review.

7. Credit                   Programs that provide support through             4       2%
                            loans, loan guarantees, and direct credit.

8. Mixed(b)                 Programs that contain elements of different       4       2%
                            program types.

Source: GAO summary and analysis of the Budget of the United States Government, Fiscal Year 2004, Performance and Management Assessments (Washington, D.C.: February 2003).

(a) Percentages do not add to 100 percent due to rounding.

(b) OMB noted that in rare cases, drawing questions from two of the seven PART program types (that is,
creation of a “mixed” program type) yields a more informative assessment.
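
The rounding note on table 2 can be checked with a few lines of arithmetic. The sketch below, offered only as an illustration, recomputes each program type's share of the 234 assessed programs from the counts in the table and rounds to whole percents. It reads the regulatory-based count as 15, which is the value that makes the counts total 234; the variable names are hypothetical.

```python
# Program counts by type, as listed in table 2.
PROGRAM_COUNTS = {
    "Direct federal": 67,
    "Block/formula grant": 41,
    "Competitive grant": 37,
    "Capital assets and service acquisition": 34,
    "Research and development": 32,
    "Regulatory-based": 15,
    "Credit": 4,
    "Mixed": 4,
}

total_programs = sum(PROGRAM_COUNTS.values())

# Share of all assessed programs, rounded to the nearest whole percent.
rounded_shares = {
    name: round(100 * count / total_programs)
    for name, count in PROGRAM_COUNTS.items()
}

print(total_programs)                # 234
print(sum(rounded_shares.values()))  # 102
```

The rounded shares match the percentages in the table and sum to 102 rather than 100, which is consistent with the table's note that the percentages do not add to 100 percent.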



During the fiscal year 2004 budget cycle, OMB applied the PART to 234
programs (about 20 percent of the fiscal year 2004 President’s Budget
request to Congress*), and gave each program one of four overall ratings:
(1) “effective,” (2) “moderately effective,” (3) “adequate,” or (4)
“ineffective” based on program design, strategic planning, management,
and results. A fifth rating, “results not demonstrated,” was given,
independent of a program’s numerical score, if OMB decided that a
program’s performance information, performance measures, or both were
insufficient or inadequate. The administration plans to assess an additional
20 percent of the budget each year until the entire executive branch has
been reviewed. For more information on the development of the PART, see
appendix III.
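
The rating logic described above, in which a fifth rating overrides the numerical score, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, and the numeric cutoffs are placeholders supplied by the caller, since this passage does not state the score ranges tied to each rating.

```python
def overall_rating(score, evidence_sufficient, cutoffs):
    """Return an overall PART-style rating for one program.

    cutoffs is a list of (minimum_score, label) pairs ordered from the
    highest minimum to the lowest. The numeric cutoffs are not given in
    this passage and must be supplied by the caller.
    """
    # The fifth rating is assigned regardless of the numerical score.
    if not evidence_sufficient:
        return "results not demonstrated"
    for minimum, label in cutoffs:
        if score >= minimum:
            return label
    raise ValueError("score is below every cutoff")


# Placeholder cutoffs, for illustration only.
ILLUSTRATIVE_CUTOFFS = [
    (85, "effective"),
    (70, "moderately effective"),
    (50, "adequate"),
    (0, "ineffective"),
]

print(overall_rating(90, True, ILLUSTRATIVE_CUTOFFS))
print(overall_rating(90, False, ILLUSTRATIVE_CUTOFFS))
```

The second call shows the point made in the text: a program with a high numerical score still receives “results not demonstrated” when its performance information or measures are judged insufficient.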


OMB Used the PART to Systematically Assess Program Performance and Make
Results Known, but Follow-up on PART Recommendations Is Uncertain


The PART clarified OMB’s use of performance information in its budget
decision-making process and stimulated new interest in budget and
performance integration. OMB generally proposed budget increases for
programs that received ratings of “effective” or “moderately effective” and
decreased funding requests for those programs that were rated
“ineffective,” but there were clear exceptions. Moreover, the more
important role of the PART was not in making resource decisions but in its
support for recommendations to improve program design, assessment, and
management. OMB’s ability to use the PART to identify and address future
program improvements and measure progress (a major purpose of the
PART) is predicated on its ability to oversee the implementation of PART
recommendations. However, it is not clear that OMB has a centralized
system to oversee the implementation of such recommendations or
evaluate their effectiveness.


* OMB defined 20 percent of the budget as either 20 percent of programs or their funding
levels so long as all programs are assessed over the 5-year cycle for fiscal years 2004 through
2008 budget requests.

The PART Made Budget and Performance Integration at OMB More Transparent

The PART helped structure and discipline the use of performance
information in the budget process and made the use of such information
transparent throughout the executive branch. According to OMB
senior officials and many of the examiners and branch chiefs, the PART
lent structure to a process that had previously been informal and gave OMB
staff a systematic way of asking performance-related questions. Both
agency and OMB staff noted that this helped ensure that OMB staff with
varying levels of experience focused on the same issues, fostering a more
disciplined approach to discussing performance within OMB and with
agencies. Agency officials told us that by encouraging more
communication between departments and OMB, the PART helps illuminate
both how OMB makes budget decisions and how OMB staff think about
program management. The PART also provided a framework for raising
performance issues during the OMB Director’s Reviews. OMB managers
and staff reported that it led to richer discussions on what a program
should be achieving, whether the program was performing effectively, and
how program performance could be improved.

Agencies also reported that the PART process expanded the dialogue
between program, planning, and budget staffs, and stimulated interest in
budget and performance integration. Several agency officials stated that
the PART worksheets were a catalyst for bringing staffs together since
none could fully respond to the questionnaire alone. OMB and agency
officials agreed that the PART led to more interactions between OMB and
agency program and planning staff and, in turn, increased program
managers’ awareness of and involvement in the budget process. According
to OMB and several agency officials, the PART process (that is, responding
to the PART questionnaire) involved staff outside of the performance
management area. Additionally, both agency and OMB officials said that
the attention given to programs that were not routinely reviewed was a
positive benefit of the PART process.


Use of Performance Information Was Evident in OMB’s Recommendations

OMB senior officials told us that one of the PART’s most notable impacts
was its ability to highlight OMB’s recommended changes in program
management and design. As shown in figure 1, we found that 82 percent of
PART recommendations addressed program assessment, design, and
management issues; only 18 percent of the recommendations had a direct
link to funding matters.*

* The 234 programs assessed for fiscal year 2004 contained a total of 612 recommendations.
Figure 1: Fiscal Year 2004 PART Recommendations



The majority of recommendations relate to changes that go well beyond
funding consideration for one budget cycle. For example, OMB and HHS
officials agree that the Foster Care program as it is currently designed does
not provide appropriate incentives for the permanent placement of
children; the program financially rewards states for keeping children in
foster care instead of the original intent of providing temporary, safe, and
appropriate homes for abused or neglected children until children can be
returned to their families or other permanent arrangements can be made.
The PART assessment provided support for OMB’s recommendation that
legislation be introduced that would create an option for states to
participate in an alternate financing program that would “better meet the
needs of each state’s foster care population.”

Performance information included in the PART for the Department of
Labor’s (DOL) Community Service Employment for Older Americans
program helped to shape OMB’s recommendation to increase competition
for the grants. OMB concluded that although the Older Americans Act of
2000 amendments* authorize competition for grants in cases in which
grantees repeatedly fail to perform, the program’s 10 national grantees
have historically been the sole recipients of grant funds regardless of
performance. OMB recommended that DOL award national grants
competitively to strengthen service delivery and open the door to new
grantees.

* Pub. L. No. 106-501 (2000).

As OMB and others recognize, performance is not the only factor in funding
decisions. Determining priorities, including funding priorities, is a
function of competing values and interests. As seen in figure 2, we found
that PART scores were generally positively related to proposed funding
changes in discretionary programs but not in a mechanistic way. In other
words, PART scores did not automatically determine funding changes.
OMB proposed funding increases for most of the programs rated “effective”
or “moderately effective” and proposed funding decreases for most of the
programs rated “ineffective,” but there were clear exceptions. Programs
rated as “results not demonstrated,” which reflected a range of PART
scores, had mixed results.


Figure 2: Number of Discretionary PART Programs by Rating and Funding Result, Fiscal Years 2003-2004

[Figure not reproduced. For each program rating, the chart shows the percentage of programs with a proposed funding increase, no change, or decrease. Ratings shown: Effective (n = 10), Moderately effective (n = 44), Adequate (n = 29), Ineffective (n = 12), and Results not demonstrated (n illegible in the source).]

Source: GAO analysis of OMB data.


Note: Discretionary programs refer to those programs with budgetary resources provided in
appropriation acts. Because Congress controls spending for mandatory programs (generally
entitlement programs such as food stamps, Medicare, and veterans’ pensions) indirectly rather than
directly through the appropriations process, we excluded them from our analysis. Of the 234 programs,
we could not classify 11 as being either predominantly mandatory or discretionary; these programs are
excluded from our analysis as well, and are listed in appendix I.
A large portion of the variability in proposed budget changes could not be
explained by the quantitative measures reported by the PART. Regressions
of PART scores never explained more than about 15 percent of the
proposed budget changes. For only the one-third of discretionary programs
with the smallest budgets, we found that the composite PART scores had a
modest but statistically significant effect on proposed budget changes
(measured in percentage change) between fiscal years 2003 and 2004. For a
fuller discussion of the statistical methods used, see appendix I.

The relationship between performance levels and budget decisions was not
one-dimensional. For example, OMB rated the Department of Defense’s
Basic Research program as “effective,” but recommended a reduction in
congressionally earmarked projects that it stated did not meet the
program’s merit review process. OMB also recommended reducing funding
for DOE’s International Nuclear Materials Protection and Cooperation
program (rated “effective”) because difficulties in obtaining international
agreements had resulted in the availability of sufficient unobligated
balances* to make new funding unnecessary. However, OMB sometimes
proposed funding increases for programs that were rated “ineffective” to
implement improvement plans that had been developed, such as the
Internal Revenue Service’s new Earned Income Tax Credit compliance
initiatives and DOE’s revised environmental cleanup plans for its
Environmental Management (Cleanup) program.

* Unobligated balances are defined as portions of available budget authority that the agency
has not set aside to cover current legal liabilities.


Capacity Issues Could Affect OMB’s Ability to Use the PART to Drive
Program Improvements


OMB has said that a major purpose of the PART is to focus on program
improvements and measure progress. Effectively implementing PART
recommendations aimed at program improvements will require sustained
attention and sufficient oversight of agencies to ensure that the
recommendations are producing desirable results. However, each year, the
number of recommendations from previous years’ evaluations will grow.
Currently, OMB plans to assess an additional 20 percent of all federal
programs annually such that all programs would eventually be reviewed
over a 5-year period. OMB encouraged its RMOs to consider a variety of
factors in selecting programs for the fiscal year 2004 PART assessments,
including continuing presidential initiatives and programs up for
reauthorization. Strengthening the focus on selecting related programs for
review in a given year would enable decision makers to analyze the relative
efficacy of similar programs in meeting common or similar outcomes. As
our work has shown, unfocused and uncoordinated programs waste scarce
funds, confuse and frustrate program customers, and limit overall program
effectiveness. Therefore it is prudent to highlight crosscutting program
efforts and clearly relate and address the contributions of alternative
federal strategies toward meeting similar goals.

Although OMB has created a template for agencies to report on the status
of their recommendations and has reported that agencies are implementing
their PART recommendations, OMB has no central system for monitoring
agency progress or evaluating the effectiveness of changes. While RMOs
are responsible for overseeing agency progress, OMB senior managers will
not have a comprehensive governmentwide picture of progress on the
implementation of PART recommendations, nor will they have a complete
picture of OMB’s workload in this area. As OMB has recognized, following
through on the recommendations is essential for improving program
performance and ensuring accountability.

Senior OMB managers readily recognized the increased workload the PART 
placed on examiners — in one public forum we attended, a senior OMB 
official described many examiners as being very concerned about the 
additional workload. However, OMB expects the workload to decline as 
OMB and agency staff become more familiar with the PART tool and 
process, and as issues with the timing of the PART reviews are resolved. 
Agency officials told us that originally, there was no formal guidance for 
reassessing PART programs — it varied by RMO. When issued, OMB’s 
formal PART guidance limited reassessments to (1) updating the 
status/implementation of recommendations from the fiscal year 2004 PART 
and (2) revisiting specific questions for which new evidence exists. OMB 
expected that in most reassessments, only those questions in which change 
could be demonstrated would be “reopened.” OMB officials acknowledged 
that this formal guidance is at least partly due to resource constraints. 

OMB staff were divided on whether the PART assessments made an 
appreciable difference in time spent on its budget review process. Many of 
those we spoke with told us that their workloads during the traditional 
budget season have always been heavy and that PART did not add 
significantly to their work, especially since the PART generally formalized a 
process already taking place. Those who did acknowledge workload 
concerns said that they were surprised at the amount of time it was taking 
to reassess programs. In fact, more than one OMB official told us that 
reassessing programs was taking almost as long as brand-new assessments, 
despite the fact that OMB scaled back the scope of these reassessments. 


Despite OMB’s Considerable Efforts to Create a Credible Evaluation Tool, 
PART Assessments Require Judgment and Were Constrained by Data Limitations 


OMB went to great lengths to encourage consistent application of the PART 
in the evaluation of government programs, including pilot testing the 
instrument, issuing detailed guidance, and conducting consistency reviews. 
However, while the instrument can undoubtedly be improved, any tool that 
is sophisticated enough to take into account the complexity of the U.S. 
government will always require OMB staff to exercise interpretation and 
judgment. Providing flexibility to assess multidimensional programs with 
multiple purposes and impacts has led to a reliance on OMB staff 
judgments to apply general principles to specific cases. Accordingly, OMB 
staff were not fully consistent in interpreting complex questions about 
agency goals and results. In addition, the limited availability of credible 
evidence on program results also constrained OMB’s ability to use the 
PART to rate programs’ effectiveness. 


Inherent Performance Measurement Challenges Make It Difficult to 
Meaningfully Interpret a Bottom-Line Rating 


OMB published a single, bottom-line rating for the PART results as well as 
individual section scores, which are potentially more useful for identifying 
information gaps and program weaknesses. For example, one program that 
was rated “adequate” overall got high scores for purpose (80 percent) and 
planning (100 percent), but did poorly in being able to show results (39 
percent) and in program management (46 percent). Thus, the individual 
section ratings provided a better understanding of areas needing 
improvement than the overall rating alone. Bottom-line ratings inevitably 
force choices on what best exemplifies a program’s mission, even when a 
program has multiple goals, and encourage a determination of the 
effectiveness of the program even when performance data are unavailable, 
the quality of those data is uneven, or they convey a mixed message on 
performance. 

Many of the outcomes for which federal programs are responsible are part 
of a broader effort involving federal, state, local, nonprofit, and private 
partners. We have previously reported that it is often difficult to isolate a 
particular program’s contribution to an outcome and especially so when it 
involves third parties. This was reinforced by the results of the fiscal year 
2004 PART reviews. One of the patterns that OMB identified in its ratings 
was that grant programs received lower than average ratings. To OMB this 
suggested the need for greater effort by agencies to make grantees 
accountable for achieving overall program results. However, grant 
structure and design play a role in how federal agencies are able to hold 
third parties responsible and complicate the process of identifying the 
individual contributions of a federal program with multiple partners. In 
particular, block grants present implementation challenges, especially in 
those instances in which national goals are not compatible with state and 
local priorities. 


OMB Employed Numerous Tools and Techniques to Promote and Improve 
Consistent Application of the PART 


OMB went to great lengths to encourage consistent application of the PART 
in the evaluation of government programs. These efforts included (1) 
testing the PART in selected agencies before use in the fiscal year 2004 
assessment, (2) issuing detailed guidance and worksheets for use by PART 
teams, (3) making the Performance Evaluation Team (PET) available to 
answer PART implementation questions, (4) establishing an Interagency 
Review Panel (IRP) to review consistency of PART evaluations, and (5) 
making improvements to the fiscal year 2005 process and guidance based 
upon the fiscal year 2004 experience. 


OMB conducted a pilot test of the PART and released a draft of the PART 
questionnaire for public comment prior to its use for the fiscal year 2004 
budget cycle. During Spring Review in 2002, OMB and agency staff piloted 
the draft PART on 67 programs. The PART was also shared with and 
commented on by the Performance Measurement Advisory Council and 
other external groups. According to OMB, the results of the Spring Review 
and feedback from external groups were used to revise the draft version of 
the PART to lessen subjectivity and increase the consistency of reviews. 


See GAO-03-595T and U.S. General Accounting Office, Managing for Results: Efforts to 
Strengthen the Link Between Resources and Results at the Administration for Children 
and Families, GAO-03-9 (Washington, D.C.: Dec. 10, 2002). 

OMB issued detailed guidance to help OMB and agency staff consistently 
apply the PART and created electronic “templates” or worksheets to aid in 
completing PART assessments. This guidance explains the purpose of each 
question and describes the evidence required to support a “yes” or “no” 
answer. In order to account for different types of programs, several 
questions tailored to the seven program types were added to the PART 
(primarily in Section III, Program Management). While the PART guidance 
cannot be expected to cover every situation, the instructions established 
general standards for PART evaluations. 

PET addressed in “real time” the questions and issues repeatedly raised by 
OMB staff completing the PART evaluations. PET consisted of 
examiners drawn from across the OMB organization representing a variety 
of programmatic knowledge and experiences. It served as a sounding 
board for OMB staff and a source for sharing experiences, issues, and 
useful approaches and also provided training to OMB and agency staff on 
the process. For example, in one OMB branch, staff were grappling with 
how to apply the PART to a set of block grants. They went through the 
instrument with the PET member from their RMO and continued to consult 
with that individual throughout the process. 

OMB also formed IRP, which consisted of both OMB and agency officials, 
to conduct a consistency check of the PART reviews and to review formal 
appeals of the process or results for particular questions. During the fiscal 
year 2004 budget process, IRP conducted a consistency review of 10 
percent of the PART evaluations using a subset of the PART questions that 
OMB staff identified as being the most subjective or difficult to interpret. 
IRP also reviewed formal agency appeals to determine whether there was 
consistent treatment of similar situations. 


As an Evaluation Tool, the PART Has Weaknesses in Its Design and, as a 
Result, Its Implementation 


Despite the considerable time and effort OMB has devoted to promoting 
consistent application of the PART, difficulties both with the tool itself 
(such as subjective terminology and a restrictive yes/no format) and with 
implementing the tool (including inconsistencies in defining acceptable 
measures and contradictory answers to “pairs” of related questions) 
aggravated the general performance measurement challenges described 
earlier. 

Subjective Terms and a Restrictive Format Contributed to Subjective and 
Inconsistent Responses 


Many PART questions contain subjective terms that are open to 
interpretation. Examples include terminology such as “ambitious” in 
describing sought-after performance measures. Because the 
appropriateness of a performance measure depends on the program’s 
purpose, and because program purposes can vary immensely, an ambitious 
goal for one program might be unrealistic for a similar but more narrowly 
defined program. Some agency officials claimed that having multiple 
statutory goals disadvantaged their programs. Without further guidance, 
subjective terminology can influence program ratings by permitting OMB 
staff’s views about a program’s purpose to affect assessments of the 
program’s design and achievements. 


Although OMB employed a yes/no format for the PART because OMB 
believes it aided standardization, the format resulted in oversimplified 
answers to some questions. OMB received comments on the yes/no format 
in conducting the PART pilot. Some parties liked the certainty and forced 
choice of yes/no. Others felt the format did not adequately distinguish 
between the performance of various programs, especially in the results 
section (originally in the yes/no format). In response to these concerns, 
OMB revised the PART in the spring of 2002 to include four response 
choices in the results section (adding “small extent” and “large extent” to 
the original two choices “yes” and “no”), while retaining the dichotomous 
yes/no format in the other three sections. OMB acknowledged that a “yes” 
response should be definite and reflect a very high standard of 
performance, and that it would more likely be difficult to justify a “yes” 
answer than a “no” answer. Nonetheless, agency officials have commented 
that the yes/no format is a crude reflection of reality, in which progress in 
planning, management, or results is more likely to resemble a continuum 
than an on/off switch. 


Moreover, the yes/no format was particularly troublesome for questions 
containing multiple criteria for a “yes” answer. As discussed previously, we 
conducted an in-depth analysis of PART assessments for 28 related 
programs in nine clusters and compared the responses to related questions. 
That analysis showed six instances in which some OMB staff gave a “yes” 
answer for successfully achieving some but not all of the multiple criteria, 
while others gave a “no” answer when presented with a similar situation. 
For example, Section II, Question 1, asks, “Does the program have a limited 
number of specific, ambitious, long-term performance goals that focus on 
outcomes and meaningfully reflect the purpose of the program?” The PART 
defines successful long-term goals by multiple, distinct characteristics 
(program has long-term goals, time frames by which the goals are to be 
achieved, etc.), but does not clarify whether a program can receive a “yes” 
if each of the characteristics is met, or if most of the characteristics are 
met. This contributed to a number of inconsistencies across program 
reviews. For example, OMB judged DOI’s Water Reuse and Recycling 
program “no” on this question, noting that although DOI set a long-term 
goal of 500,000 acre-feet per year of reclaimed water, it failed to establish a 
time frame for when it would reach the target. However, OMB judged the 
Department of Agriculture’s and DOI’s Wildland Fire programs “yes” on this 
question even though the programs’ long-term goals of improved 
conditions in high-priority forest acres are not accompanied by specific 
time frames. In another example, OMB accepted DOD's recently 
established long-term strategic goals for medical training and provision of 
health care even though it did not yet have measures or targets for those 
goals. By breaking out targets and ambitious time frames separately from 
the question of annual goals, agencies have an opportunity to get credit for 
progress made. 


There Were Inconsistencies in Defining Acceptable Measures and in 
Logically Responding to Question “Pairs” 


In particular, our analysis of the nine program clusters revealed three 
instances in which OMB staff inconsistently defined appropriate 
measures — outcome versus output — for programs. Officials also told us 
that OMB staff used different standards to define measures as outcome 
oriented. This may reflect, in part, the complexity of and relationship 
between expected program benefits. Outcomes are generally defined as the 
results of outputs — products and services — delivered by a program. But in 
some programs, long-term outcomes are expected to occur over time 
through multiple steps. In these cases, short-term outcomes — immediate 
changes in knowledge and awareness — might be expected to lead to 
intermediate outcomes — behavioral changes in the future — and eventually 
result in long-term outcomes — benefits to the public. 


In the employment and training area, OMB accepted short-term outcomes, 
such as obtaining high school diplomas or employment, as a proxy for long- 
term goals for the HHS Refugee Assistance program, which aims to help 
refugees attain economic self-sufficiency as soon as possible after they 
arrive. However, OMB did not accept the same employment rate measure 
as a proxy for long-term goals for the Department of Education’s Vocational 
Rehabilitation program because it had not set long-term targets beyond a 
couple of years. In other words, although neither program contained 
long-term outcomes, such as participants gaining economic self-sufficiency, 
OMB accepted short-term outcomes in one instance but not the other. 
Similarly, OMB gave credit for output measures of claims processing (time, 
accuracy, and productivity) as a proxy for long-term goals for the Social 
Security Administration’s Disability Insurance program, but did not accept 
the same output measures for the Veterans Disability Compensation 
program. OMB took steps to address this issue for fiscal year 2005. 

We also found that three “question pairs” on the PART worksheets are 
linked, yet in two of the three “pairs,” a disconnect appeared in how OMB 
staff responded to these questions for a given program. For example, 29 of 
the 90 programs (32 percent) judged as lacking “independent and quality 
evaluations of sufficient scope conducted on a regular basis” (Section II, 
Question 5) were also judged as having “independent and quality 
evaluations that indicated the program is effective and achieving results” 
(Section IV, Question 5). There is a logical inconsistency in these two 
responses. In another instance, there was no linkage between the questions 
that examine whether a program has annual goals that demonstrate 
progress toward achieving long-term goals and whether the program 
actually achieves its annual goals. For example, 15 of the 75 programs (20 
percent) judged not to have adequate annual performance goals (Section II, 
Question 2) were nevertheless credited for having made progress on their 
annual performance goals (Section IV, Question 2). However, the guidance 
for the latter question clearly indicates that a program must receive a “no” 
if it received a “no” on the existence of annual goals (Section II, Question 
2). It seems that some raters held programs to a higher standard for the 
quality of goals than for progress on them. 
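
The linkage rule described above lends itself to a mechanical check. The following Python sketch is illustrative only and is not drawn from OMB's worksheets or tools: the data layout, the function names, and the sample programs are all hypothetical. It flags any program that received a “no” on the planning question but something other than “no” on the linked results question, and it reproduces the percentages cited in the text.

```python
from typing import Dict, List, Tuple

# (planning question, linked results question). Only the pair whose rule is
# stated in the text is listed; the "II.2"/"IV.2" labels and this layout are
# assumptions made for illustration.
LINKED_PAIRS: List[Tuple[str, str]] = [("II.2", "IV.2")]


def find_pair_inconsistencies(
    answers: Dict[str, Dict[str, str]],
    pairs: List[Tuple[str, str]] = LINKED_PAIRS,
) -> List[Tuple[str, str, str]]:
    """Return (program, planning_q, results_q) for each violated linkage."""
    flagged = []
    for program, responses in answers.items():
        for planning_q, results_q in pairs:
            planning = responses.get(planning_q)
            results = responses.get(results_q)
            # A "no" on the planning question requires a "no" on the
            # results question; any other recorded answer is inconsistent.
            if planning == "no" and results is not None and results != "no":
                flagged.append((program, planning_q, results_q))
    return flagged


def percent(part: int, whole: int) -> int:
    """Share of `whole` represented by `part`, rounded to a whole percent."""
    return round(100 * part / whole)


# Hypothetical sample data, not taken from any PART worksheet.
sample = {
    "Program A (hypothetical)": {"II.2": "no", "IV.2": "small extent"},
    "Program B (hypothetical)": {"II.2": "yes", "IV.2": "large extent"},
    "Program C (hypothetical)": {"II.2": "no", "IV.2": "no"},
}
flagged = find_pair_inconsistencies(sample)
```

Under these assumptions only Program A is flagged, and `percent(15, 75)` and `percent(29, 90)` return the 20 and 32 percent figures reported above.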


In the third question pair, a question in the planning section asks about whether the 
program has long-term goals, and a question in the results section asks whether the agency 
has made progress in achieving the program’s long-term goals. Yet, in 6 of the 115 programs 
(6 percent) judged not to have adequate long-term goals, credit was given for making 
progress on their long-term goals even though the guidance again clearly states that a 
program must receive a “no” if the program received a “no” on the existence of long-term 
outcome goals. 

The Lack of Performance Information Creates Challenges in Effectively 
Measuring Program Performance 


According to OMB, 115 out of 234 programs (49 percent) lacked “specific, 
ambitious, long-term performance goals that focus on outcomes” (Section 
II, Question 1). In addition, OMB found that 90 out of 234 programs (38 
percent) lacked sufficient “independent, quality evaluations” (Section II, 
Question 5). While the validity of these assessments may be subject to 
interpretation and debate, our previous work has raised concerns about 
the capacity of federal agencies to produce evaluations of program 
effectiveness. 


The lack of evaluations may in part be driven by how OMB defined an 
“independent and quality evaluation.” To be independent, an evaluation 
would be conducted by nonbiased parties with no conflict of interest, but 
agency officials felt that OMB staff started from the default position that 
agency-sponsored evaluations are, by definition, biased. However, our detailed 
review of 28 PART worksheets found only 7 instances in which OMB 
explicitly noted its rejection of evaluations: 1 for being too old, 3 for not 
being independent (of the 3, 1 was an internal agency review and 2 were 
conducted by industry groups), and the remaining 3 for not assessing 
program results. OMB officials have acknowledged that this issue was a 
point of friction with agencies and that beyond GAO, inspectors general, 
and other government reports that were automatically presumed to be 
independent, the independence standard was considered on a case-by-case 
basis. In these case-by-case situations, OMB staff told us that they looked 
for some degree of detachment and objectivity in the evaluations. For 
example, in the case of one DOE-sponsored evaluation, the OMB examiner 
attended the meetings of the review group that conducted the evaluation in 
order to see firsthand what sorts of questions the committee posed to the 
department officials. In OMB’s estimation, there was clear independence. 
While OMB changed the fiscal year 2005 guidance to recognize evaluations 
contracted out to third parties and agency program evaluation offices as 
possibly being sufficiently independent, the new guidance generally 
prohibits evaluations conducted by the program itself from being 
considered “independent.” 

Other reasons evaluation data may be limited include (1) constraints on 
federal agencies’ ability to influence program outcomes and reliance on 
states and others for data for programs for which responsibility has 
devolved to the states and (2) the lack of a statutory mandate or dedicated 
funds for evaluation, which agency officials told us can hamper efforts to 
conduct studies or to improve administrative data collection. 

U.S. General Accounting Office, Program Evaluation: Agencies Challenged by New 
Demand for Information on Program Results, GAO/GGD-98-53 (Washington, D.C.: Apr. 24, 
1998). 

As we have previously noted, program evaluations can take many forms 
and agencies may obtain evaluations in a variety of ways. Some 
evaluations simply analyze routinely collected program administrative 
data; others involve special surveys. The type of evaluation can greatly 
affect evaluation cost. Net impact evaluations compare outcomes for 
program participants to those of a randomly assigned control group and are 
designed for situations in which external factors are also known to 
influence those outcomes. However, the adequacy of an evaluation design 
can only be determined relative to the circumstances of the program being 
evaluated. In addition, agencies can obtain evaluations by having program 
or other agency staff collect and analyze the data, by conducting the work 
jointly with program partners (such as state agencies), or by hiring contract 
firms to do so. Our survey of 81 federal agency offices conducting 
evaluations in 1995 of program results found they were most commonly 
located in administrative offices at a major subdivision level or in program 
offices (43 and 30 percent, respectively). Overall, they reported conducting 
51 percent of their studies in-house, while 34 percent were contracted out. 
Depending on the sensitivity of the study questions, agencies can conduct 
credible internal evaluations by adopting procedures to ensure the 
reliability and validity of data collection and analysis. 


GAO/GGD-98-53. 

Disagreements on Performance Information Led to Creation of a “Results 
Not Demonstrated” Category 


During the PART process OMB created an additional rating category, 
“results not demonstrated,” which was applied to programs regardless of 
their score if OMB decided that one or both of two conditions pertained: 
(1) OMB and the agency could not reach agreement on long-term and 
annual performance measures and (2) there was inadequate performance 
information. Almost 50 percent of the 234 programs assessed for fiscal year 
2004 received this rating of “results not demonstrated,” ranging from 
high-scoring programs such as the Consumer Product Safety Commission (83) 
to low-scoring programs such as the Department of Veterans Affairs 
Disability Compensation program (15). OMB officials said that this rating 
was given to programs when agreement could not be reached on long-term 
and annual performance measures and was applied regardless of the 
program’s PART score. Our own review found that OMB generally assigned 
the “results not demonstrated” rating as described above. 


It is important for users of the PART information to interpret the “results 
not demonstrated” designation as “unknown effectiveness” rather than as 
meaning the program is “ineffective.” Having evidence of poor results is not 
the same as lacking evidence of effectiveness. Because the PART guidance 
sets very high standards for obtaining a “yes,” a “no” answer can mean 
either that a program did not meet the standards, or that there is no 
evidence on whether it met the standards. In some readily measured areas, 
lack of evidence of an action may indicate that the standard probably was 
not met. However, because effectiveness is often not readily observed, lack 
of evidence on program effectiveness cannot be automatically interpreted 
as meaning that a program is ineffective. Furthermore, an agency might 
have results for goals established under GPRA, but if OMB and the agency 
could not reach agreement on new or revised goals or measures, then OMB 
gave a program the rating “results not demonstrated.” 


However, we found 8 programs (out of 118) that were rated as “results not 
demonstrated” despite having both annual and long-term performance goals and evidence 
that these goals were being met. 

Changes to the PART and Related Guidance for Fiscal Year 2005 Are Meant to 
Address Previously Identified Problems 


OMB, recognizing many of the issues we have just discussed, made 
modifications to the PART instrument and guidance in time for the fiscal 
year 2005 process. OMB said these changes were based upon lessons 
learned during the fiscal year 2004 process and input from a variety of 
sources, such as PET, IRP, and agency officials, although we were unable to 
determine which changes resulted from which recommendations. Although 
the PART as used for fiscal year 2005 is very similar to that for fiscal year 
2004, several questions were added, dropped, merged with other questions, 
or divided into two questions. For example, a research and development 
question used in the fiscal year 2004 PART that received “not applicable” 
answers in 13 out of the 32 cases in which it was applied was dropped from 
the fiscal year 2005 PART. According to OMB officials, several of the 
multicriteria questions were split into separate questions in order to reduce 
inconsistency, as described earlier in this report. Appendix II provides 
more complete information on the guidance changes between fiscal years 
2004 and 2005. To complement the fiscal year 2005 PART guidance and 
offer strategies for addressing common performance measurement 
challenges, many of which were encountered during the fiscal year 2004 
process, OMB released a separate document, titled Performance 
Measurement Challenges and Strategies, which was the result of a 
workshop in which agencies participated and identified measurement 
challenges and shared best practices and possible work-arounds. 


Instead of reestablishing IRP (which included both agency and OMB 
representatives) for the fiscal year 2005 process, OMB officials told us that 
PET (which included only OMB representatives) would conduct a 
consistency review of 25 percent of all PART evaluations, with at least one 
consistency check per OMB branch. OMB also told us that it has asked the 
National Academy of Public Administration (NAPA) to review PET’s 
consistency review for the fiscal year 2005 process; the scope and results of 
that review were not available to us during our audit work. OMB senior 
officials cited resources, timing, and the differing needs of the fiscal year 
2004 and 2005 PART processes as reasons for dropping the IRP review. The 
absence of agency participation in this important phase of the PART could 
hamper ensuring crucial transparency and credibility. 


Because our audit focused on the fiscal year 2004 PART process, our engagement was not 
limited by OMB’s decision to not share its reasoning for shifting the consistency review from 
IRP to PET or our lack of access to the NAPA review. 

The Fiscal Year 2004 PART Process Was a Parallel, Competing Approach to 
GPRA’s Performance Management Framework 


The PART was designed for and is used in the executive branch budget 
preparation and review process; as such, the goals and measures used in 
the PART must meet OMB’s needs. However, GPRA — the current statutory 
framework for strategic planning and reporting — is a broader process 
involving the development of strategic and performance goals and 
objectives to be reported in strategic and annual plans. OMB’s desire to 
collect performance data that better align with budget decision units means 
that the fiscal year 2004 PART process was a parallel competing structure 
to the GPRA framework. Although OMB acknowledges that GPRA was the 
starting point for the PART, as we explain below, the emphasis is shifting 
such that over time the performance measures developed for the PART and 
used in the budget process may come to drive agencies’ strategic planning 
processes. 

Agencies told us that in some cases, OMB is substituting PART goals and 
measures for those of GPRA. Effective for fiscal year 2005, OMB’s Circular 
A-11 guidance states that performance budgets are to replace GPRA’s 
annual performance plans. Agencies see the change as detrimental to 
planning and reporting under GPRA and as a resource drain since they have 
to respond to both GPRA and PART requirements. Some agency officials 
told us that although the PART can stimulate discussion on 
program-specific performance measurement issues, it is not a substitute for GPRA’s 
outcome-oriented, strategic look at thematic goals and departmentwide 
program comparisons. Moreover, while the PART does not eliminate the 
departmental strategic plans created under GPRA, many OMB and agency 
officials told us that the PART is being used to shape the strategic plans. 

OMB’s Efforts to Link Performance Information with the Budget Often 
Conflict with Agencies’ GPRA Planning Efforts 


OMB guidance and officials made clear that GPRA goals, measures, and 
reports needed to be modified to provide decision makers with 
program-specific, outcome-based performance data that better aligned with the 
budget presentation in the President’s Budget. According to OMB, such 
changes were needed because performance reporting under GPRA had 
evolved into a process separate from budget decision making, with GPRA 
plans organized at too high a level to be meaningful for program-level 
budget analysis and management review. Furthermore, according to OMB 
officials, GPRA plans had too many performance measures, which made it 
difficult to determine an agency’s priorities. However, as some officials 
pointed out, the cumulative effect of adding new PART measures to GPRA 
plans may actually increase the number of measures overall; both agency 
and OMB officials recognize that this is contrary to goals issued by an OMB 
official previously responsible for the PART, indicating his desire to reduce 
the number of GPRA measures by at least 25 percent in at least 70 percent 
of federal departments. As a result of these sometimes-conflicting 
perspectives, agency officials said that responding to both PART and GPRA 
requirements increased their workloads and was a drain on staff resources. 


OMB’s most recent Circular A-11 guidance clearly requires that each 
agency submit a performance budget for fiscal year 2005 and that this 
should replace the annual GPRA performance plan. These performance 
budgets are to include information from the PART assessments, where 
available, including all performance goals used in the assessment of 
program performance done under the PART process. Until all programs 
have been assessed using the PART, the performance budget will also 
include performance goals for agency programs that have not yet been 
assessed using the PART. OMB’s movement from GPRA to PART is further 
evident in the fiscal year 2005 PART guidance stating that while existing 
GPRA performance goals may be a starting point during the development 
of PART performance goals, the GPRA goals in agency GPRA documents 
are to be revised significantly, as needed, to reflect OMB’s instructions for 
developing the PART performance goals. Lastly, this same guidance states 
that GPRA plans should be revised to include any new performance 
measures used in the PART and unnecessary measures should be deleted 
from GPRA plans. 


Memorandum to the President’s Management Council, “Where We’d Be Proud To Be,” May 
21, 2003. 

OMB Circular A-11, Preparation, Submission, and Execution of the Budget. 

OMB’s interest in developing more useful program goals is further evident 
in its PART recommendations. Almost half of the fiscal year 2004 PART 
recommendations related to performance assessment — developing 
outcome goals and measures; cost or efficiency measures; and increasing 
the tracking/monitoring of data, improving the tracking/monitoring of data, 
or both. GPRA was generally the starting point for PART discussions about 
goals and measures, and many agency officials told us that OMB used the 
PART to modify agencies’ existing GPRA goals and measures. Agency 
officials reported that the discussions about goals and measures were one 
of the main areas of contention during the PART process. At the same time, 
agency officials acknowledged that (1) sometimes OMB staff accepted 
current GPRA measures and (2) sometimes the new PART measures and 
goals were improvements over the old GPRA measures — the PART 
measures were more aggressive, more outcome-oriented, more targeted, or 
all of the above. 


Defining a “Unit of Analysis” That Is Useful for Program-Level Budget 
Analysis and Agency Planning Purposes Presents Challenges 


The appropriate unit of analysis or “program” is not always obvious. What 
OMB determined was useful for a PART assessment did not necessarily 
match agency organization or planning elements. Although the units of 
analysis varied across the PART assessments, OMB’s guidance stated that 
they should be linked to a recognized funding level in the budget. In some 
cases, OMB aggregated separate programs for the purposes of the PART, 
while in other cases it disaggregated programs. Aggregating programs to tie 
them to discrete funding levels sometimes made it difficult to create a 
limited, but comprehensive, set of measures for programs with multiple 
missions. Disaggregating programs sometimes ignored the 
interdependence of programs by artificially isolating programs from the 
larger contexts in which they operate. Both contributed to the lack of 
available planning and performance information. For example, aggregating 
rural water supply projects as a single unit of analysis may have been a 
logical choice for reviewing related activities, but it created problems in 
identifying planning and performance information useful for the PART 
since these projects are separately administered. In another case, HHS 
officials told us that the PART program Substance Abuse Treatment 
Programs of Regional and National Significance is an amalgamation of 
activities funded in a single budget line, not an actual program. They said it 
was a challenge to make these activities look as if they functioned as a 
single program. 


Disaggregating a program too narrowly can create problems by distorting 
its relationship to other programs involved in achieving a common goal. 
For example, agency officials described a homeless program in which 
outreach workers help homeless persons with emergency needs and refer 
them to other agencies for housing and needed services. They said that 
their OMB counterparts suggested that the program adopt long-term 
outcome measures indicating number of persons housed. Agency officials 
argued that chronically homeless people require many services and that 
this federal program often supports only some of the services needed at the 
initial stages of intervention. The federal program, therefore, could 
contribute to, but not be primarily responsible for, affecting late stages of 
the intervention process, like housing status. 

These issues reveal some of the unresolved tensions between the 
President’s budget and performance initiative — a detailed budget 
perspective — and GPRA— a more strategic planning view. In particular, 
agency officials are concerned with problems in trying to respond to both 
and overwhelmingly agreed that the PART required a large amount of 
agency resources to complete. Moreover, some agency officials said that 
the PART (a program-specific review) is not well suited to one of the key 
purposes of strategic plans — to convey agencywide, long-term goals and 
objectives for all major functions and operations. In addition, the time 
horizons are different for the two initiatives — PART assessments focus on 
program accomplishments to date while GPRA strategic planning is long-term 
and prospective in nature. 


Changes Made to GPRA in 
the PART Process Create 
Uncertainty About 
Opportunities for 
Substantive Input by 
Interested Parties and 
Congressional Stakeholders 


As noted above, PART goals and measures must meet OMB’s needs, while 
GPRA is a broader process involving the development of strategic and 
performance goals and objectives to be reported in strategic and annual 
plans. As a phased reform, GPRA required development of the planning 
framework first, but also explicitly encouraged links to the budget.^® Our 
work has shown that under GPRA agencies have made significant 
progress.®* Additionally, GPRA requires agencies to consult with Congress 
and solicit the views of other stakeholders as they develop their strategic 
plans.®® We have previously reported®® that stakeholder involvement 
appears critical for getting consensus on goals and measures. Stakeholder 
involvement can be particularly important for federal agencies because 
they operate in a complex political environment in which legislative 
mandates are often broadly stated and some stakeholders may strongly 
disagree about the agency’s mission and goals. 


The relationship between the PART and its process and the broader GPRA 
strategic planning process is still evolving. Some tension between the level 
of stakeholder involvement in the development of performance measures 
in the GPRA strategic planning process and the process of developing 
performance measures for the PART is inevitable. Compared to the 
relatively open-ended GPRA process, any budget formulation process is 
likely to seem closed. An agency’s communication with stakeholders, 
including Congress, about goals and measures created or modified during 
the formulation of the President’s budget is likely to be less than during the 
development of the agency’s own strategic or performance plan. Since 
different stakeholders have different needs and no one set of goals and 
measures can serve all purposes, the PART can complement GPRA but 
should not replace it. 


* 31 U.S.C. § 1115(a) (2003). 

U.S. General Accounting Office, Managing for Results: Agency Progress in Linking 
Performance Plans With Budgets and Financial Statements, GAO-02-236 (Washington, 
D.C.: Jan. 4, 2002). 

5 U.S.C. § 306(d) (2003). 

U.S. General Accounting Office, Agencies' Strategic Plans Under GPRA: Key Questions to 
Facilitate Congressional Review (Version 1), GAO/GGD-10.1.16 (Washington, D.C.: May 
1997). 

Although these tensions between the need for internal deliberations and 
broader consultations are inevitable, if the PART is to be accepted as a 
credible element in the development of the President’s budget proposal, 
congressional understanding and acceptance of the tool and its analysis 
will be important. In order for performance information to more fully 
inform resource allocations, decision makers must also feel comfortable 
with the appropriateness and accuracy of the performance information and 
measures associated with these goals. It is unlikely that decision makers 
will use performance information unless they believe it is credible and 
reliable and reflects a consensus about performance goals among a 
community of interested parties. Similarly, the measures used to 
demonstrate progress toward a goal, no matter how worthwhile, cannot 
serve the interests of a single stakeholder or purpose without potentially 
discouraging use of this information by others. 

While it is still too soon to know whether OMB-directed measures will 
satisfy the needs of other stakeholders and GPRA’s broader planning 
purposes, several appropriations subcommittees have stated, in their 
appropriations hearings, the need to link the PART with congressional 
oversight. For example, the House Committee on Appropriations, 
Subcommittee on the Department of the Interior and Related Agencies 
notes that while it supports the PMA, the costs of initiatives associated 
with it have generally not been requested in annual budget justifications or 
through reprogramming procedures.^^ The Subcommittee, therefore, has 
been unable to evaluate the costs, benefits, and effectiveness of these 
initiatives or to weigh the priority that these initiatives should receive as 
compared with ongoing programs funded in the Interior Appropriations 
bill. Similarly, the House Report on Treasury and Transportation 
Appropriations included a statement in support of the PART, but noted that 
the administration’s efforts must be linked with the oversight of Congress 
to maximize the utility of the PART process, and that if the administration 
treats as privileged or confidential the details of its rating process, it is less 
likely that Congress will use those results in deciding which programs to 
fund. Moreover, the Subcommittee said it expects OMB to involve the 
House and Senate Committees on Appropriations in the development of the 
PART ratings at all stages in the process.®^ 


'' H.R. Rep. No. 108-196, p. 8 (2003). 

* H.R. Rep. No. 108-243, pp. 168-69 (2003). 

While Congress has a number of opportunities to provide its perspective on 
performance issues and performance goals, such as when it establishes or 
reauthorizes a new program, during the annual appropriations process, and 
in its oversight of federal operations, opportunities exist for Congress to 
more systematically articulate performance goals and outcomes for key 
programs of major concern and to allow for timely congressional input in 
the selection of the PART programs to be assessed. 


Conclusions and 
General Observations 


OMB, through its development and use of the PART, has more explicitly 
infused performance information into the budget formulation process; 
increased the attention paid to evaluation and performance information; 
and ultimately, we hope, increased the value of this information to decision 
makers and other stakeholders. By linking performance information to the 
budget process, OMB has provided agencies with a powerful incentive for 
improving data quality and availability. The level of effort and involvement 
by senior OMB officials and staff clearly signals the importance of this 
strategy in meeting the priorities outlined in the PMA. OMB should be 
credited with opening up for scrutiny — and potential criticism — its review 
of key areas of federal program performance and then making its 
assessments available to a potentially wider audience through its Web site. 


While the PART clearly serves the needs of OMB in budget formulation, 
questions remain about whether it serves the needs of other key 
stakeholders. The PART could be strengthened to enhance its credibility 
and prospects for sustainability by such actions as (1) improving agencies’ 
and OMB’s capacity to cope with the demands of the PART, (2) 
strengthening the PART guidance, (3) expanding the base of credible 
performance information by strategically focusing evaluation resources, 
(4) selecting programs for assessment to facilitate crosscutting 
comparisons and trade-offs, (5) broadening the dialogue with 
congressional stakeholders, and (6) articulating and implementing a 
complementary relationship between PART and GPRA. 


OMB’s ambitious schedule for assessing all federal programs by the fiscal 
year 2008 President’s Budget will require a tremendous commitment of 
OMB’s and agencies’ resources. Implementation of the PART 
recommendations will be a longer-term and potentially more significant 
result of the PART process than the scores and ratings. No less important 
will be OMB's involvement both in encouraging agency progress and in 
signaling its continuing commitment to improving program management 
and results through the PART. OMB has created a template by which 
agencies report on the status of the recommendations and left follow-up on 
the recommendations to each RMO. However, there is no single focal point 
for evaluating progress and the results of agency efforts governmentwide; 
without this it will be difficult for OMB to judge the efficacy of the PART 
and to know whether the increased workload and trade-offs made with 
other activities are a good investment of OMB and agency resources. 

The goal of the PART is to evaluate programs systematically, consistently, 
and transparently, but in practice, the tool requires OMB staff to use 
independent judgment in interpreting the guidance and in making yes or no 
decisions for what are often complex federal programs. These difficulties 
are compounded by poor or partial program performance data. Therefore, 
it is not surprising that we found inconsistencies in our analysis of the 
fiscal year 2004 PART assessments. Recognizing the inherent limitations of 
any tool to provide a single performance answer or judgment on complex 
federal programs with multiple goals, continued improvements in the PART 
guidance, with examples throughout, can nonetheless help encourage a 
higher level of consistency as well as transparency. 

The PART requires more performance and evaluation information than 
agencies currently have, as demonstrated by the fact that OMB rated over 
50 percent of the programs for fiscal year 2004 as “results not 
demonstrated” because they “did not have adequate performance goals" or 
“had not yet collected data to provide evidence of results.” In the past, we 
too have noted limitations in the quality of agency performance and 
evaluation information and in agency capacity to produce rigorous 
evaluations of program effectiveness. Furthermore, our work has shown 
that few agencies deployed the rigorous research methods required to 
attribute changes in underlying outcomes to program activities. However, 
program evaluation information often requires large amounts of agency 
resources to produce, and the agency and OMB may not agree on what is 
important to measure, particularly when a set of measures cannot serve 
multiple purposes. Agreement on what are a department or agency’s 
critical, high-risk programs and how best to evaluate them could help 
leverage limited resources and help determine what are the most important 
program evaluation data to collect. 

Federal programs are designed and implemented in dynamic environments 
where competing program priorities and stakeholders’ needs must be 
balanced continually and new needs must be addressed. GPRA is a broad 
legislative framework that was designed to be consultative with Congress 
and other stakeholders and allows for varying uses of performance 
information, while the PART applies evaluation information to support 
decisions and program reviews during the executive budget formulation 
process. While the PART reflects the administration's management 
principles and the priority given to using performance information in 
OMB’s decision-making process, its focus on program-level assessments 
cannot substitute for the inclusive, crosscutting strategic planning required 
by GPRA. Moreover, GPRA can anchor the review of programs by providing 
an overall strategic context for programs’ contributions toward agency 
goals. Although PART and GPRA serve different needs, a strategy for 
integrating the two could help strengthen both. 

Opportunities exist to develop a more strategic approach to the selection 
and prioritization of areas to be assessed under the PART process. 
Targeting PART assessments based on such factors as the relative 
priorities, costs, and risks associated with related clusters of programs and 
activities could not only help ration scarce analytic resources but could 
also focus decision makers’ attention on the most pressing policy and 
program issues. Moreover, such an approach could facilitate the use of 
PART assessments to review the relative contributions of similar programs 
to common or crosscutting goals and outcomes. 

As part of the President’s budget preparation, the PART clearly must serve 
the President’s interests. However, it is unlikely that the broad range of 
actors whose input is critical to decisions will use performance information 
unless they believe it is credible and reliable and reflects a consensus about 
performance goals among a community of interested parties. Similarly, the 
measures used to demonstrate progress toward a goal, no matter how 
worthwhile, cannot appear to serve a single set of interests without 
potentially discouraging use of this information by others. If the President 
or OMB wants the PART and its results to be considered in the 
congressional debate, it will be important for OMB to (1) involve 
congressional stakeholders early in providing input on the focus of the 
assessments; (2) clarify any significant limitations in the assessments as 
well as the underlying performance information; and (3) initiate 
discussions with key congressional committees about how they can best 
take advantage of and leverage PART information in authorizations, 
appropriations, and oversight processes. 

As we have previously reported, effective congressional oversight can help 
improve federal performance by examining the program structures 
agencies use to deliver products and services to ensure that the best, most 
cost-effective mix of strategies is in place to meet agency and national 
goals. While Congress has a number of opportunities to provide its 
perspective on performance issues and performance goals, such as when it 
establishes or reauthorizes a new program, during the annual 
appropriations process, and in its oversight of federal operations, a more 
systematic approach could allow Congress to better articulate performance 
goals and outcomes for key programs of major concern. Such an approach 
could also facilitate OMB’s understanding of congressional priorities and 
concerns and, as a result, increase the usefulness of the PART in budget 
deliberations. 


Matter for 
Congressional 
Consideration 


In order to facilitate an understanding of congressional priorities and 
concerns, we suggest that Congress consider the need for a strategy that 
could include (1) establishing a vehicle for communicating performance 
goals and measures for key congressional priorities and concerns; 
(2) developing a more structured oversight agenda to permit a more 
coordinated congressional perspective on crosscutting programs and 
policies; and (3) using such an agenda to inform its authorization, 
oversight, and appropriations processes. 


Recommendations for 
Executive Action 


We have seven recommendations to OMB for building on and improving the 
first year’s experience with the PART and its process. We recommend that 
the Director of OMB take the following actions: 


• Centrally monitor agency implementation and progress on PART 
recommendations and report such progress in OMB’s budget 
submission to Congress. Governmentwide councils may be effective 
vehicles for assisting OMB in these efforts. 

• Continue to improve the PART guidance by (1) expanding the 
discussion of how the unit of analysis is to be determined to include 
trade-offs made when defining a unit of analysis, implications of how the 
unit of analysis is defined, or both; (2) clarifying when output versus 
outcome measures are acceptable; and (3) better defining an 
“independent, quality evaluation.” 

• Clarify OMB’s expectations to agencies regarding the allocation of 
scarce evaluation resources among programs, the timing of such 
evaluations, as well as the evaluation strategies it wants for the 
purposes of the PART, and consider using internal agency evaluations as 
evidence on a case-by-case basis, whether conducted by agencies, 
contractors, or other parties. 

• Reconsider plans for 100 percent coverage of federal programs and, 
instead, target for review a significant percentage of major and 
meaningful government programs based on such factors as the relative 
priorities, costs, and risks associated with related clusters of programs 
and activities. 

• Maximize the opportunity to review similar programs or activities in the 
same year to facilitate comparisons and trade-offs. 

• Attempt to generate, early in the PART process, an ongoing, meaningful 
dialogue with congressional appropriations, authorization, and 
oversight committees about what they consider to be the most 
important performance issues and program areas warranting review. 

• Seek to achieve the greatest benefit from both GPRA and PART by 
articulating and implementing an integrated, complementary 
relationship between the two. 


Agency Comments 


We provided a draft of this report to OMB for its review and comment. 
OMB generally agreed with our findings, conclusions, and 
recommendations. In addition, OMB outlined actions it is taking to address 
many of our recommendations, including refining the process for 
monitoring agencies’ progress in implementing the PART 
recommendations, seeking opportunities for dialogue with Congress on 
agencies’ performance, and continuing to improve executive branch 
implementation of GPRA plans and reports. OMB officials provided a 
number of technical comments and clarifications, which we incorporated 
as appropriate to ensure the accuracy of our report. OMB’s comments 
appear in appendix IV. We also received technical comments on excerpts of 
the draft provided to the Departments of the Interior, Energy, and Health 
and Human Services. Comments received from the Departments of Energy 
and the Interior were incorporated as appropriate. The Department of 
Health and Human Services had no comments. 

OMB noted that performance information gleaned from the PART process 
has not only informed budget decisions but has also helped direct program 
management, identified opportunities to improve program design, and 
promoted accountability. We agree. As shown in figure 1 in our report, we 
found that 82 percent of PART recommendations addressed program 
assessment, design, and management issues; only 18 percent of the 
recommendations had a direct link to funding matters. 


We are sending copies of this report to the Director of OMB, appropriate 
congressional committees, and other interested members of Congress. We 
will also make copies available to others upon request. In addition, the 
report will be available at no charge on the GAO Web site at 
http://www.gao.gov. 

If you or your staff have questions about this report, please contact Paul 
Posner at (202) 512-9573 or posnerp@gao.gov. An additional contact and 
key contributors to this report are listed in appendix V. 



David M. Walker 
Comptroller General 
of the United States 


Appendix I 

Scope and Methodology 


To address the objectives in this report, we reviewed Office of Management 
and Budget (OMB) materials and presentations on the development and 
implementation of the Program Assessment Rating Tool (PART) as well as 
the results of the PART assessments. Our review of materials included 
instructions for using PART, OMB’s testimony concerning PART, and public 
remarks made by OMB officials at relevant conferences and training. We 
also reviewed PART-related information on OMB’s Web site, including the 
OMB worksheets used to support the assessments, and attended OMB’s 
PART training for the fiscal year 2004 process. 

For this report, we focused on the process and final results of the fiscal 
year 2004 PART process, but also looked at the initial stages of the fiscal 
year 2005 process. We compared the PART guidance for both years and 
asked agency and OMB staff to discuss generally the differences between 
the 2 fiscal years. We did not review the final results for the fiscal year 2005 
PART, which are embargoed until the publication of the President’s fiscal 
year 2005 budget request. For the same reasons, we did not review the 
results of any reassessments conducted for fiscal year 2005 on programs 
originally assessed for fiscal year 2004. This report presents the 
experiences of staff from the three departments and OMB officials whom we 
interviewed. We did not directly observe the PART process (for either year) 
in operation nor did we independently verify the PART assessments as 
posted on OMB's Web site or the program or financial information 
contained in the documents provided as evidence for the PART 
assessments. We did, however, take several steps to ensure that we reliably 
downloaded and combined the PART summaries and worksheets with our 
budget and recommendation classifications. Our steps included (1) having 
the computer programs we used to create and process our consolidated 
dataset verified by a second programmer; (2) having transcribed data 
elements from all programs checked back to source files; and (3) having 
selected, computer-processed data elements checked back to source files 
for a random sample of programs and also for specific programs identified 
in our analyses. 

To better understand the universe of programs OMB assessed for fiscal 
year 2004, we developed overall profiles of PART results and examined 
relationships between such characteristics as type of program, type of 
recommendation, overall rating, total PART score, and answers for each 
question on PART. This review enabled us to generally confirm some 
information previously reported by OMB, for example, that PART scores do 
not automatically determine proposed funding and that grant programs 
scored lower overall than other types of programs. It also allowed us to 
select a sample of programs for more in-depth review, and this sample was 
used to determine which OMB and agency officials we interviewed. 

To gain a better understanding of the PART process at both OMB and 
agencies, to inform our examination of the usefulness of PART as an 
evaluation tool, and to obtain various perspectives on the relationship 
between PART and GPRA, we interviewed officials at OMB and three 
selected departments. At OMB, we interviewed a range of staff, such as 
associate directors, deputy assistant directors, branch chiefs, and 
examiners. Specifically, we interviewed staff in two Resource Management 
Offices (RMO). In the Human Resources Programs RMO, we spoke with 
staff from the Health Division and the Education and Human Resources 
Division. In the Natural Resources, Energy and Science RMO we 
interviewed staff from the Energy and Interior Branches. In addition, we 
obtained the views of two groups within OMB that were convened 
specifically for the PART process: the Performance Evaluation Team (PET) 
and the Interagency Review Panel (IRP). The IRP included agency officials 
in addition to staff from OMB. 

The three departments for which we reviewed the PART process were the 
Department of Energy (DOE), the Department of Health and Human 
Services (HHS), and the Department of the Interior (DOI). We selected 
these three departments based on our data analysis of program types. The 
departments selected and their agencies had a variety of program types 
(e.g., block/formula grants, competitive grants, direct federal, and research 
and development) that were subject to PART and could provide us with a 
broad-based perspective on how PART was applied to different programs 
employing diverse tools of government. We also chose these three 
departments because they had programs under PART review within the 
two RMOs at OMB where we did more extensive interviewing, thus 
enabling us to develop a more in-depth understanding of how the PART 
process operated for a subset of programs. We used this information to 
complement our broader profiling of all 234 programs assessed. Within 
DOE we studied the experiences of the Office of Science, the Office of 
Energy Efficiency and Renewable Energy, and the Office of Fossil Energy. 
Within HHS, we studied the experiences of the Administration for Children 
and Families, the Health Resources and Services Administration, and the 
Substance Abuse and Mental Health Services Administration. Within DOI, 
we studied the experiences of the Bureau of Land Management, the Bureau 
of Indian Affairs, and the National Park Service. We interviewed planning, 
budget, and program staff within each of the nine agencies as well as those 
at the department level. We also reviewed relevant supporting materials 
provided by these departments in conjunction with these interviews. 

To allow us to describe how PART was used in fiscal year 2004 to influence 
changes in future performance, we created a consolidated dataset in which 
we classified recommendations OMB made by three areas in need of 
improvement: (1) program design, (2) program management, and (3) 
program assessment. A fourth category was created for those 
recommendations that involved funding issues. We created a consolidated 
dataset of information from our analysis of recommendations and selected 
information from the PART program summary page and worksheet for 
each program.* 

In addition, for approximately 95 percent of the programs, we identified 
whether the basis for program funding was mandatory or discretionary. It 
was important to separate discretionary and mandatory programs in our 
review of PART's potential influence on the President’s budget proposals 
because funding for mandatory programs is determined through 
authorizations, not through the annual appropriations process. Of the 234 
programs that OMB assessed for fiscal year 2004, we identified 27 
mandatory programs and 196 discretionary, but could not categorize 11 
programs as solely mandatory or discretionary because they were too 
mixed to classify.^ 

For discretionary programs, we explored the relationship between PART 
results and proposed budget changes in a series of regression analyses.^ 
Using statistical analysis, we found that PART scores influenced proposed 


' The PART program summary sheets are included in the Budget of the United States 
Government, Fiscal Year 2004, Performance and Management Assessments (Washington, 
D.C.: February 2003). The summary sheets and worksheets for the 234 programs are on 
OMB’s Web site: http://www.whitehouse.gov/omb/budget/fy2004/pma.html. 

® These 11 programs are animal welfare, food aid, multifamily housing direct loans and 
rental assistance, rural electric utility loans and guarantees, and rural water and wastewater 
grants and loans programs in the Department of Agriculture; the nursing education loan 
repayment and scholarship program in HHS; the methane hydrates program in DOE; the 
reclamation hydropower program in DOI; the long-term guarantees program in the U.S. 
Export-Import Bank; and the climate change and development assistance/population 
programs in the Agency for International Development. 

* We tested the regression on mandatory programs and as expected the results showed no 
relationship between the PART scores and the level of funding proposed in the President’s 
Budget. 

funding changes for discretionary programs; however, a large amount of 
variability in these changes remains unexplained. We examined proposed 
funding changes between fiscal years 2003 and 2004 (measured by 
percentage change) and the relationship to PART scores for the programs 
assessed in the fiscal year 2004 President’s Budget. These scores are the 
weighted sums of scores for four PART categories: Program Purpose and 
Design, Strategic Planning, Program Management, and Program Results 
and Accountability. The corresponding weights assigned by OMB are 0.2, 
0.1, 0.2, and 0.5, respectively.^ Tables in this appendix report regression 
results obtained using the method of least squares with heteroskedasticity- 
corrected standard errors.® The same estimation method is used 
throughout this analysis. 
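
The weighted sum described above can be illustrated with a minimal sketch. The four weights are those stated in the text; the 0 to 100 scale for section scores, the function name, and the example program are assumptions made only for illustration and are not drawn from the PART data.

```python
# Category weights as stated in the text; they sum to 1.0.
PART_WEIGHTS = {
    "Program Purpose and Design": 0.2,
    "Strategic Planning": 0.1,
    "Program Management": 0.2,
    "Program Results and Accountability": 0.5,
}


def overall_part_score(section_scores):
    """Return the weighted sum of the four section scores (assumed 0-100 each)."""
    missing = set(PART_WEIGHTS) - set(section_scores)
    if missing:
        raise ValueError(f"missing section scores: {sorted(missing)}")
    return sum(PART_WEIGHTS[name] * section_scores[name] for name in PART_WEIGHTS)


# Hypothetical program used only to show the arithmetic.
example = {
    "Program Purpose and Design": 80.0,
    "Strategic Planning": 70.0,
    "Program Management": 90.0,
    "Program Results and Accountability": 40.0,
}
print(overall_part_score(example))
```

Because half of the weight falls on Program Results and Accountability, a program with weak results data scores low overall even when its design and management sections are strong.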

Overall PART scores have a positive and statistically significant effect on 
discretionary program funding. The programs evaluated by OMB include 
both mandatory and discretionary programs. Regression results for 
mandatory programs showed — as expected — no relationship between 
PART scores and the level of funding in the President's Budget proposal. 
Assessment ratings, however, can potentially affect the funding for 
discretionary programs either in the President’s Budget proposal or in 
congressional deliberations on spending bills.® Table 3 reports the 
regression results for discretionary programs. 
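
The estimation method named above, least squares with heteroskedasticity-corrected standard errors, can be sketched for the one-regressor case as follows. This is an illustrative sketch only: the text does not say which correction variant was used, so the HC1 small-sample factor here is an assumption, the function name is hypothetical, and the data are synthetic rather than the PART dataset.

```python
import math
import random


def ols_robust(x, y):
    """Fit y = const + slope * x by least squares with HC1 robust standard errors."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x must vary across observations")
    # Inverse of X'X for the design matrix with columns [1, x].
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    const = inv[0][0] * sy + inv[0][1] * sxy
    slope = inv[1][0] * sy + inv[1][1] * sxy
    resid = [yi - (const + slope * xi) for xi, yi in zip(x, y)]
    # Sandwich variance: observation i contributes (e_i * (row of inv) . [1, x_i])^2.
    scale = n / (n - 2)  # HC1 factor; the variant is an assumption, not from the text
    var_const = scale * sum(
        (e * (inv[0][0] + inv[0][1] * xi)) ** 2 for e, xi in zip(resid, x)
    )
    var_slope = scale * sum(
        (e * (inv[1][0] + inv[1][1] * xi)) ** 2 for e, xi in zip(resid, x)
    )
    return {
        "const": const,
        "slope": slope,
        "se_const": math.sqrt(var_const),
        "se_slope": math.sqrt(var_slope),
    }


# Synthetic data for illustration: the noise grows with the score (heteroskedastic).
rng = random.Random(0)
scores = [rng.uniform(20, 90) for _ in range(196)]
changes = [-25 + 0.5 * s + rng.gauss(0, 0.3 * s) for s in scores]
fit = ols_robust(scores, changes)
print({k: round(v, 3) for k, v in fit.items()})
```

The coefficient estimates are the ordinary least-squares values; only the standard errors change, which is why the correction matters for the t-statistics and P-values rather than for the estimated effect itself.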


* Budget of the United States Government, Fiscal Year 2004, Performance and 
Management Assessments, 10. 

‘ For a discussion of this method, see W.H. Greene, Econometric Analysis, Section 10.3 
(Upper Saddle River, N.J-: Prentice Hall, 2003). 

* Budget of the United States Government, Fiscal Year 2001, A Citizen^ Guide to the 
Federal Budget (Washington, D.C.: February 2000), 

htlp://w3.access.gpo.gov/usbudgcL/fy2001/guide03.htmI, (downloaded Aprii 2003), 2. 
Table 3: The Effect of Overall PART Score on Proposed Funding Changes
(Discretionary Programs)

                      Coefficient   Robust
Variable              estimate      standard error   t-Statistic   P-value
Overall PART score       0.536          0.159            3.38       0.001
Constant               -25.671          8.682           -2.96       0.003

Source: GAO analysis of OMB data.

Notes: R-squared = 0.058, Prob-F = 0.001, N = 196. Originally we identified 197 discretionary
programs. However, no fiscal year 2004 budget estimate is reported for the Dislocated Worker
Assistance program due to grant consolidation at the Department of Labor. (Budget of the United
States Government, Fiscal Year 2004, Performance and Management Assessments (Washington,
D.C.: February 2003), 191.) This reduced the number of discretionary programs to 196.

The estimated coefficient of the overall score is positive and significant.
These results show that the aggregate PART score has a positive and 
statistically significant effect on the proposed change in discretionary 
programs’ budget, suggesting that programs with better scores are more 
likely to receive larger proposed budget increases. 
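
To make the size of this effect concrete, the fitted line implied by table 3 can be evaluated at a few scores. This is a sketch using the rounded estimates from table 3; it assumes the overall score is on a 0 to 100 scale and that the dependent variable is the percentage change in proposed funding. The function and variable names are hypothetical.

```python
# Rounded estimates reported in table 3.
COEFFICIENT = 0.536
ROBUST_SE = 0.159
CONSTANT = -25.671


def predicted_change(score):
    """Fitted percentage change in proposed funding at a given overall score."""
    return CONSTANT + COEFFICIENT * score


# Ratio of the rounded inputs; close to the reported 3.38, with the small
# gap attributable to rounding of the published figures.
t_statistic = COEFFICIENT / ROBUST_SE

# Score at which the fitted change in funding is zero.
break_even_score = -CONSTANT / COEFFICIENT

for score in (40, 60, 80):
    print(score, round(predicted_change(score), 1))
print(round(t_statistic, 2), round(break_even_score, 1))
```

With an R-squared of 0.058, these fitted values describe an average tendency only; the proposed change for any individual program can differ widely from the line.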

To examine the effect of program size on our results, we divided all 
programs equally into three groups — small, medium, and large — based on 
their fiscal year 2003 funding estimate. Regressions similar to those
reported in table 3 were then performed for discretionary programs in each
group. The results, reported in tables 4, 5, and 6, suggest that the
statistically significant effect of overall scores on budget outcomes exists
only for the smaller programs. The estimated coefficient of the overall
score for large programs, which is significant but only at the 10 percent
level, reflects an outlier.’ Once this outlier is dropped, the estimated
coefficient becomes statistically insignificant. 
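
The grouping step described above can be sketched as follows. This is an illustration only: the program records, field names, and funding figures are invented, and the exact tie-breaking rule used in the original analysis is not stated in the text.

```python
def split_into_size_groups(programs):
    """Sort by fiscal year 2003 funding and cut into three near-equal groups."""
    ordered = sorted(programs, key=lambda p: p["fy2003_funding"])
    n = len(ordered)
    first_cut, second_cut = n // 3, (2 * n) // 3
    return ordered[:first_cut], ordered[first_cut:second_cut], ordered[second_cut:]


# Invented programs; funding in millions of dollars.
programs = [
    {"name": "A", "fy2003_funding": 12.0, "discretionary": True},
    {"name": "B", "fy2003_funding": 950.0, "discretionary": False},
    {"name": "C", "fy2003_funding": 48.0, "discretionary": True},
    {"name": "D", "fy2003_funding": 3100.0, "discretionary": True},
    {"name": "E", "fy2003_funding": 5.5, "discretionary": True},
    {"name": "F", "fy2003_funding": 210.0, "discretionary": True},
    {"name": "G", "fy2003_funding": 76.0, "discretionary": False},
    {"name": "H", "fy2003_funding": 1400.0, "discretionary": True},
    {"name": "I", "fy2003_funding": 330.0, "discretionary": True},
]

small, medium, large = split_into_size_groups(programs)
# The regressions are then run on the discretionary programs in each group.
small_discretionary = [p for p in small if p["discretionary"]]
print(len(small), len(medium), len(large), len(small_discretionary))
```

Splitting all programs equally and then keeping only the discretionary ones within each group would leave unequal group sizes, which may be why tables 4, 5, and 6 report N = 71, 67, and 58 rather than three equal counts.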


’ The outlier is the Community Oriented Policing Services program with an estimated 77
percent reduction in funding (see OMB, Budget of the U.S. Government, Fiscal Year 2004,
Performance and Management Assessments (Washington, D.C.: February 2003), 178). The
outlier in this case is identified using a scatter plot and estimating with and without the
outlier. The reported results for small and medium programs are not outlier driven.
Table 4: The Effect of Overall PART Score on Proposed Funding Changes (Small
Discretionary Programs)

                      Coefficient   Robust
Variable              estimate      standard error   t-Statistic   P-value
Overall PART score       1.074          0.404            2.66       0.010
Constant               -50.523         21.155           -2.39       0.020

Source: GAO analysis of OMB data.

Note: R-squared = 0.092, Prob-F = 0.01, N = 71.





Table 5: The Effect of Overall PART Score on Proposed Funding Changes (Medium-
Size Discretionary Programs)

                      Coefficient   Robust
Variable              estimate      standard error   t-Statistic   P-value
Overall PART score       0.306          0.188            1.62       0.109
Constant               -17.984         12.480           -1.44       0.154

Source: GAO analysis of OMB data.

Note: R-squared = 0.039, Prob-F = 0.109, N = 67.


Table 6: The Effect of Overall PART Score on Proposed Funding Changes (Large
Discretionary Programs)

                      Coefficient   Robust
Variable              estimate      standard error   t-Statistic   P-value
Overall PART score       0.194          0.109            1.77       0.082
Constant                -8.216          7.778           -1.06       0.295

Source: GAO analysis of OMB data.

Note: R-squared = 0.057, Prob-F = 0.082, N = 58.





The statistical analysis suggests that among the four components of the 
PART questionnaire, program purpose, management, and results have 
statistically significant effects on proposed funding changes, but the effects 
of program purpose and results are more robust across the estimated 
models. The overall score is a weighted average of four components: 
Program Purpose and Design, Strategic Planning, Program Management,
and Program Results and Accountability.® To identify which of the four
components contribute to the significant relationship observed here, we
examined the effect of each on proposed changes in programs' funding
levels. Tables 7 and 8 show estimates from regressions of the proposed 
funding change on purpose, planning, management, and results scores for 
all discretionary programs as well as small discretionary programs alone. 


Table 7: The Effect of PART Component Scores on Proposed Funding Changes (All
Discretionary Programs)

              Coefficient   Robust
Variable      estimate      standard error   t-Statistic   P-value
Purpose          0.325          0.127            2.56       0.011
Plan            -0.259          0.199           -1.30       0.194
Management       0.191          0.117            1.63       0.105
Results          0.363          0.205            1.77       0.078
Constant       -33.096         14.136           -2.34       0.020

Source: GAO analysis of OMB data.

Note: R-squared = 0.087, Prob-F = 0.003, N = 196.





Table 8: The Effect of PART Component Scores on Proposed Funding Changes
(Small Discretionary Programs)

              Coefficient   Robust
Variable      estimate      standard error   t-Statistic   P-value
Purpose          0.223          0.274            0.81       0.419
Plan            -0.671          0.543           -1.24       0.221
Management       0.547          0.304            1.80       0.077
Results          0.956          0.534            1.79       0.078
Constant       -42.455         34.800           -1.22       0.227

Source: GAO analysis of OMB data.

Note: R-squared = 0.149, Prob-F = 0.043, N = 71.




These results suggest that among the four components, program purpose, 
management, and results are more likely to affect the proposed budget 
changes for discretionary programs. When all discretionary programs are 


* Budget of the United States Government, Fiscal Year 2004, Performance and 
Management Assessments, 10. 
included, the estimated coefficients are positive and significant for results
(at the 10 percent level) and purpose. When only the small discretionary
programs are included, the estimated coefficients are positive and 
significant for both management and results (at the 10 percent level). We 
also estimated the above regression for medium and large programs, but 
coefficient estimates were not statistically significant, except for the 
estimated coefficient of purpose for medium programs. 

PART scores explain at most about 15 percent of the proposed funding
changes, leaving a large portion of the variability in proposed funding
changes unexplained. This suggests that most of the variance is due to
institutional factors, program specifics, and other unquantifiable factors.
The coefficient of determination (or R-squared) is used to measure the proportion
of the total variation in the regression’s dependent variable that is
explained by the variation in the regressors (independent variables).^ The
maximum value of this measure across all estimated regressions is about 
15 percent. 
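
The coefficient of determination can be sketched as one minus the ratio of residual variation to total variation. The function name and the data below are hypothetical; the only figure taken from the report is the largest R-squared among the tables in this appendix (0.149, for the component regression on small discretionary programs).

```python
def r_squared(actual, fitted):
    """Share of the variation in `actual` accounted for by `fitted` values.

    Intended for fitted values from a least-squares regression that
    includes an intercept.
    """
    mean_actual = sum(actual) / len(actual)
    total_ss = sum((a - mean_actual) ** 2 for a in actual)
    residual_ss = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return 1.0 - residual_ss / total_ss


# Invented percentage changes in funding.
actual = [-12.0, -3.5, 4.0, 1.5, 9.0]
mean_only = [sum(actual) / len(actual)] * len(actual)

perfect_fit = r_squared(actual, actual)       # fitted values match exactly
no_explanation = r_squared(actual, mean_only)  # fitted values ignore the regressors

# Largest R-squared reported in this appendix.
largest_reported = 0.149
share_unexplained = 1.0 - largest_reported
print(perfect_fit, no_explanation, round(share_unexplained, 3))
```

On this measure, even the best-fitting regression reported here leaves roughly 85 percent of the variation in proposed funding changes unexplained.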

Similar analyses were carried out for changes in the proposed budget for 
fiscal year 2004 and congressionally appropriated amounts in fiscal year 
2002. Results were qualitatively similar to those reported here. 

To assess the strengths and weaknesses of PART as an evaluation tool and 
the consistency with which it was applied, we analyzed data from all 234 
programs that OMB reviewed using PART for fiscal year 2004. As part of
our examination of the consistency with which PART was applied to 
programs, we also focused on a subset of programs to assess the way in 
which certain measurement issues were addressed across those programs. 
The issues were selected from those identified in interviews with officials 
from the selected agencies described above and our own review of the 
PART program summaries and worksheets. Measurement issues included 
acceptance of output versus outcome measures of annual and long-term
goals, types of studies accepted as program evaluations, acknowledgment 
of related programs, and justifications for judging a PART question as “not 
applicable.” Programs were selected that formed clusters, each addressing
a similar goal or sharing a structural similarity pertinent to performance
measurement, to examine whether PART assessment issues were handled 
similarly across programs when expected. We reviewed the worksheets 
and compared the treatment of assessment issues across specific questions 


^ See Greene, 33. 
within and across programs in a cluster to identify potential
inconsistencies in how the tool was applied. We reviewed a total of 28 
programs in nine clusters. The nine clusters are food safety, water supply, 
military equipment procurement, provision of health care, statistical 
agencies, block grants to assist vulnerable populations, energy research 
programs, wildland fire management, and disability compensation. 

With the exception of our summary analyses of all 234 programs, the 
information obtained from OMB and agency interviews, related material, 
and review of selected programs is not generalizable to the PART process 
for all 234 programs reviewed in fiscal year 2004. We conducted our review 
from May through October 2003 in accordance with generally accepted 
government auditing standards. 
Below we have reproduced OMB’s fiscal year 2004 PART instrument. We
have also included the comparison of fiscal year 2004 and fiscal year 2005 
PART questions that appeared in the fiscal year 2005 PART guidance (see 
table 9). 


Section I: Program Purpose & Design (Yes, No, N/A)

1. Is the program purpose clear?

2. Does the program address a specific interest, problem or need?

3. Is the program designed to have a significant impact in addressing the
interest, problem or need?

4. Is the program designed to make a unique contribution in addressing 
the interest, problem or need (i.e., not needlessly redundant of any 
other Federal, state, local or private efforts)? 

5. Is the program optimally designed to address the interest, problem or 
need? 


Specific Program Purpose & Design Questions by Program Type


Research and Development Programs 

6. (RD. 1) Does the program effectively articulate potential public 
benefits? 


7. (RD. 2) If an industry-related problem, can the program explain how 
the market fails to motivate private investment? 


Section II: Strategic Planning (Yes, No, N/A)

1. Does the program have a limited number of specific, ambitious
long-term performance goals that focus on outcomes and meaningfully
reflect the purpose of the program?


2. Does the program have a limited number of annual performance goals 
that demonstrate progress toward achieving the long-term goals? 

3. Do all partners (grantees, subgrantees, contractors, etc.) support
program-planning efforts by committing to the annual and/or long-term 
goals of the program? 
4. Does the program collaborate and coordinate effectively with related
programs that share similar goals and objectives? 

5. Are independent and quality evaluations of sufficient scope conducted
on a regular basis or as needed to fill gaps in performance information 
to support program improvements and evaluate effectiveness? 

6. Is the program budget aligned with the program goals in such a way 
that the impact of funding, policy, and legislative changes on 
performance is readily known? 

7. Has the program taken meaningful steps to address its strategic 
planning deficiencies? 


Specific Strategic Planning Questions by Program Type

Regulatory-Based Programs

8. (Reg. 1) Are all regulations issued by the program/agency necessary to
meet the stated goals of the program, and do all regulations clearly
indicate how the rules contribute to achievement of the goals?

Capital Assets and Service Acquisition Programs 

8. (Cap. 1) Are acquisition program plans adjusted in response to 
performance data and changing conditions? 

9. (Cap. 2) Has the agency/program conducted a recent, meaningful,
credible analysis of alternatives that includes trade-offs between cost, 
schedule and performance goals? 

Research and Development Programs 

8. (RD. 1) Is evaluation of the program’s continuing relevance to mission, 
fields of science, and other “customer” needs conducted on a regular 
basis? 

9. (RD. 2) Has the program identified clear priorities? 
Section III: Program Management (Yes, No, N/A)

1. Does the agency regularly collect timely and credible performance
information, including information from key program partners, and use
it to manage the program and improve performance?

2. Are Federal managers and program partners (grantees, subgrantees,
contractors, etc.) held accountable for cost, schedule and performance
results?


3. Are all funds (Federal and partners') obligated in a timely manner and 
spent for the intended purpose? 

4. Does the program have incentives and procedures (e.g., competitive 
sourcing/cost comparisons, IT improvements) to measure and achieve 
efficiencies and cost effectiveness in program execution? 

5. Does the agency estimate and budget for the full annual costs of 
operating the program (including all administrative costs and allocated 
overhead) so that program performance changes are identified with 
changes in funding levels?

6. Does the program use strong financial management practices? 

7. Has the program taken meaningful steps to address its management 
deficiencies? 


Specific Program Management Questions by Program Type


Competitive Grant Programs 

8. (Co. 1) Are grant applications independently reviewed based on clear 
criteria (rather than earmarked) and are awards made based on results
of the peer review process? 


9. (Co. 2) Does the grant competition encourage the participation of 
new/first-time grantees through a fair and open application process?

10. (Co. 3) Does the program have oversight practices that provide 
sufficient knowledge of grantee activities? 

11. (Co. 4) Does the program collect performance data on an annual basis 
and make it available to the public in a transparent and meaningful 
manner? 
Block/Formula Grant Programs

8. (B. 1) Does the program have oversight practices that provide sufficient 
knowledge of grantee activities? 

9. (B. 2) Does the program collect grantee performance data on an annual 
basis and make it available to the public in a transparent and 
meaningful manner? 

Regulatory-Based Programs 

8. (Reg. 1) Did the program seek and take into account the views of 
affected parties including state, local and tribal governments and small 
businesses, in drafting significant regulations? 

9. (Reg. 2) Did the program prepare, where appropriate, a Regulatory 
Impact Analysis that comports with OMB's economic analysis 
guidelines and have these RIA analyses and supporting science and 
economic data been subjected to external peer review by qualified 
specialists? 

10. (Reg. 3) Does the program systematically review its current regulations 
to ensure consistency among all regulations in accomplishing program 
goals? 

11. (Reg. 4) In developing new regulations, are incremental societal costs
and benefits compared?

12. (Reg. 5) Did the regulatory changes to the program maximize net 
benefits? 

13. (Reg. 6) Does the program impose the least burden, to the extent 
practicable, on regulated entities, taking into account the costs of 
cumulative final regulations? 

Capital Assets and Service Acquisition Programs 

8. (Cap. 1) Does the program define the required quality, capability, and 
performance objectives of deliverables? 

9. (Cap. 2) Has the program established appropriate, credible, cost and 
schedule goals? 
10. (Cap. 3) Has the program conducted a recent, credible, cost-benefit
analysis that shows a net benefit?

11. (Cap. 4) Does the program have a comprehensive strategy for risk
management that appropriately shares risk between the government 
and contractor? 

Credit Programs 

8. (Cr. 1) Is the program managed on an ongoing basis to assure credit 
quality remains sound, collections and disbursements are timely and 
reporting requirements are fulfilled? 

9. (Cr. 2) Does the program consistently meet the requirements of the 
Federal Credit Reform Act of 1990, the Debt Collection Improvement 
Act and applicable guidance under OMB Circulars A-1, A-34, and A-129?

10. (Cr. 3) Is the risk of the program to the U.S. Government measured 
effectively? 

Research and Development Programs 

8. (RD. 1) Does the program allocate funds through a competitive, merit- 
based process, or, if not, does it justify funding methods and document 
how quality is maintained? 

9. (RD. 2) Does competition encourage the participation of new/first-time 
performers through a fair and open application process? 

10. (RD. 3) Does the program adequately define appropriate termination 
points and other decision points? 

11. (RD. 4) If the program includes technology development or 
construction or operation of a facility, does the program clearly define 
deliverables and required capability-performance characteristics and 
appropriate, credible cost and schedule goals? 
Section IV: Program Results (Yes, Large Extent, Small Extent, No)

1. Has the program demonstrated adequate progress in achieving its
long-term outcome goal(s)?

• Long-Term Goal I:
Target:
Actual Progress achieved toward goal:

• Long-Term Goal II:
Target:
Actual Progress achieved toward goal:

• Long-Term Goal III:
Target:
Actual Progress achieved toward goal:

2. Does the program (including program partners) achieve its annual
performance goals?

• Key Goal I:
Performance Target:
Actual Performance:

• Key Goal II:
Performance Target:
Actual Performance:

• Key Goal III:
Performance Target:
Actual Performance:


Note: Performance targets should reference the performance baseline and
years, e.g. achieve a 5% increase over base of X in $000.

3. Does the program demonstrate improved efficiencies and cost 
effectiveness in achieving program goals each year? 

4. Does the performance of this program compare favorably to other
programs with similar purpose and goals?

5. Do independent and quality evaluations of this program indicate that 
the program is effective and achieving results? 
Specific Results Questions by Program Type


Regulatory-Based Programs

6. (Reg. 1) Were programmatic goals (and benefits) achieved at the least 
incremental societal cost and did the program maximize net benefits? 


Capital Assets and Service Acquisition Programs 


6. (Cap. 1) Were program goals achieved within budgeted costs and
established schedules? 


Research and Development Programs 

6. (RD. 1) If the program includes construction of a facility, were program 
goals achieved within budgeted costs and established schedules? 



Table 9: Side-by-Side of the Fiscal Year 2005 PART and the Fiscal Year 2004 PART Questions

Each entry gives this year's question (fiscal year 2005 PART), last year's
question (fiscal year 2004 PART), and the comment.

I. Program purpose & design

FY 2005 question 1.1: Is the program purpose clear?
FY 2004 question 1: Same.

FY 2005 question 1.2: Does the program address a specific and existing
problem, interest, or need?
FY 2004 question 2: Does the program address a specific interest, problem or
need?
Comment: Wording clarified.

FY 2005 question: (none)
FY 2004 question 3: Is the program designed to have a significant impact in
addressing the interest, problem or need?
Comment: Dropped; "significant" worked against small programs and was not
clear.

FY 2005 question 1.3: Is the program designed so that it is not redundant or
duplicative of any other Federal, state, local or private effort?
FY 2004 question 4: Is the program designed to make a unique contribution in
addressing the interest, problem or need (i.e., is not needlessly redundant of
any other Federal, state, local or private effort)?
Comment: Wording clarified.

FY 2005 question 1.4: Is the program design free of major flaws that would
limit the program's effectiveness or efficiency?
FY 2004 question 5: Is the program optimally designed to address the national
interest, problem or need?
Comment: Minor change to clarify focus; "optimally" was too broad.

FY 2005 question 1.5: Is the program effectively targeted, so that resources
will reach intended beneficiaries and/or otherwise address the program's
purpose directly?
FY 2004 question: (none)
Comment: New question to address distributional design.

Specific Program Purpose and Design Questions by Program Type

Research and Development Programs

FY 2005 question: (none)
FY 2004 question RD. 1: Does the program effectively articulate potential
public benefits?
Comment: Dropped; covered by 1.2.

FY 2005 question: (none)
FY 2004 question RD. 2: If an industry-related problem, can the program
explain how the market fails to motivate private investment?
Comment: Dropped; covered by 1.2 and 1.5.

II. Strategic planning

FY 2005 question 2.1: Does the program have a limited number of specific
long-term performance measures that focus on outcomes and meaningfully reflect
the purpose of the program?
FY 2004 question 1: Does the program have a limited number of specific,
ambitious long-term performance goals that focus on outcomes and meaningfully
reflect the purpose of the program?
Comment: Splits old II.1 into separate questions on existence of (1) long-term
performance measures and (2) targets for these measures. Together, the
measures and targets comprise the long-term performance goals addressed in
last year’s question.

FY 2005 question 2.2: Does the program have ambitious targets and timeframes
for its long-term measures?
FY 2004 question: (none)
Comment: Splits old II.1; see above.

FY 2005 question 2.3: Does the program have a limited number of specific
annual performance measures that can demonstrate progress toward achieving the
program's long-term goals?
FY 2004 question 2: Does the program have a limited number of annual
performance goals that demonstrate progress toward achieving the long-term
goals?
Comment: Splits old II.2 into separate questions on existence of (1) annual
performance measures and (2) targets for these measures. Together, the
measures and targets comprise the annual performance goals addressed in last
year's question.

FY 2005 question 2.4: Does the program have baselines and ambitious targets
for its annual measures?
FY 2004 question: (none)
Comment: Splits old II.2; see above.

FY 2005 question 2.5: Do all partners (including grantees, sub-grantees,
contractors, cost-sharing partners, and other government partners) commit to
and work toward the annual and/or long-term goals of the program?
FY 2004 question 3: Do all partners (grantees, sub-grantees, contractors,
etc.) support program planning efforts by committing to the annual and/or
long-term goals of the program?
Comment: Wording clarified.

FY 2005 question: (none)
FY 2004 question 4: Does the program collaborate and coordinate effectively
with related programs that share similar goals and objectives?
Comment: Moved to question 3.5.
Appendix II

The Fiscal Year 2004 PART and Differences Between the Fiscal Year 2004 and 2005 PARTs



FY 2005 question 2.6: Are independent evaluations of sufficient scope and
quality conducted on a regular basis or as needed to support program
improvements and evaluate effectiveness and relevance to the problem,
interest, or need?
FY 2004 question 5: Are independent and quality evaluations of sufficient
scope conducted on a regular basis or as needed to fill gaps in performance
information to support program improvements and evaluate effectiveness?
Comment: Wording clarified.

FY 2005 question 2.7: Are budget requests explicitly tied to accomplishment of
the annual and long-term performance goals, and are the resource needs
presented in a complete and transparent manner in the program’s budget?
FY 2004 question 6: Is the program budget aligned with the program goals in
such a way that the impact of funding, policy, and legislative changes on
performance is readily known?
Comment: Modified.

FY 2005 question 2.8: Has the program taken meaningful steps to correct its
strategic planning deficiencies?
FY 2004 question 7: Same.

Specific Strategic Planning Questions by Program Type

Regulatory Based Programs

FY 2005 question 2.RG1: Are all regulations issued by the program/agency
necessary to meet the stated goals of the program, and do all regulations
clearly indicate how the rules contribute to achievement of the goals?
FY 2004 question Reg. 1: Same.

Capital Assets & Service Acquisition Programs

FY 2005 question: (none)
FY 2004 question Cap. 1: Are acquisition program plans adjusted in response to
performance data and changing conditions?
Comment: Dropped; covered in 2.CA1 and 3.CA1.

FY 2005 question 2.CA1: Has the agency/program conducted a recent, meaningful,
credible analysis of alternatives that includes trade-offs between cost,
schedule, risk, and performance goals and used the results to guide the
resulting activity?
FY 2004 question Cap. 2: Has the agency/program conducted a recent,
meaningful, credible analysis of alternatives that includes trade-offs between
cost, schedule and performance goals?
Comment: Minor change.

R&D Programs

R&D programs addressing technology development or the construction or
operation of a facility should answer 2.CA1.

FY 2005 question 2.RD1: If applicable, does the program assess and compare the
potential benefits of efforts within the program to other efforts that have
similar goals?
FY 2004 question RD. 1: Is evaluation of the program’s continuing relevance to
mission, fields of science, and other "customer" needs conducted on a regular
basis?
Comment: Modified.

FY 2005 question 2.RD2: Does the program use a prioritization process to guide
budget requests and funding decisions?
FY 2004 question RD. 2: Has the program identified clear priorities?
Comment: Modified.

III. Program management

FY 2005 question 3.1: Does the agency regularly collect timely and credible
performance information, including information from key program partners, and
use it to manage the program and improve performance?
FY 2004 question 1: Same.

FY 2005 question 3.2: Are Federal managers and program partners (including
grantees, sub-grantees, contractors, cost-sharing partners, and other
government partners) held accountable for cost, schedule and performance
results?
FY 2004 question 2: Same.

FY 2005 question 3.3: Are funds (Federal and partners’) obligated in a timely
manner and spent for the intended purpose?
FY 2004 question 3: Same.

FY 2005 question 3.4: Does the program have procedures (e.g. competitive
sourcing/cost comparisons, IT improvements, appropriate incentives) to measure
and achieve efficiencies and cost effectiveness in program execution?
FY 2004 question 4: Same.

FY 2005 question 3.5: Does the program collaborate and coordinate effectively
with related programs?
FY 2004 question: (none)
Comment: Same as old question 2.4.

FY 2005 question: (none)
FY 2004 question 5: Does the agency estimate and budget for the full annual
costs of operating the program (including all administrative costs and
allocated overhead) so that program performance changes are identified with
changes in funding levels?
Comment: Now covered by guidance for question 2.7.

FY 2005 question 3.6: Does the program use strong financial management
practices?
FY 2004 question 6: Same.

FY 2005 question 3.7: Has the program taken meaningful steps to address its
management deficiencies?
FY 2004 question 7: Same.

Specific Program Management Questions by Program Type

Competitive Grant Programs

FY 2005 question 3.CO1: Are grants awarded based on a clear competitive
process that includes a qualified assessment of merit?
FY 2004 question Co. 1: Are grant applications independently reviewed based on
clear criteria (rather than earmarked) and are awards made based on results of
the peer review process?
Comment: Modified. Guidance also captures former question Co. 2.



FY 2005 question: (none)
FY 2004 question Co. 2: Does the grant competition encourage the participation
of new/first-time grantees through a fair and open application process?
Comment: Now considered in guidance for answering 3.CO1, above.

FY 2005 question 3.CO2: Does the program have oversight practices that provide
sufficient knowledge of grantee activities?
FY 2004 question Co. 3: Does the agency have sufficient knowledge about
grantee activities?
Comment: Wording clarified.

FY 2005 question 3.CO3: Does the program collect grantee performance data on
an annual basis and make it available to the public in a transparent and
meaningful manner?
FY 2004 question Co. 4: Same.

Block/Formula Grant Programs

FY 2005 question 3.BF1: Does the program have oversight practices that provide
sufficient knowledge of grantee activities?
FY 2004 question B. 1: Same.

FY 2005 question 3.BF2: Does the program collect grantee performance data on
an annual basis and make it available to the public in a transparent and
meaningful manner?
FY 2004 question B. 2: Same.

Regulatory Based Programs

FY 2005 question 3.RG1: Did the program seek and take into account the views
of all affected parties (e.g., consumers; large and small businesses; State,
local and tribal governments; beneficiaries; and the general public) when
developing significant regulations?
FY 2004 question Reg. 1: Did the program seek and take into account the views
of affected parties including state, local and tribal governments and small
businesses in drafting significant regulations?
Comment: Wording clarified.

FY 2005 question 3.RG2: Did the program prepare adequate regulatory impact
analyses if required by Executive Order 12866, regulatory flexibility analyses
if required by the Regulatory Flexibility Act and SBREFA, and cost-benefit
analyses if required under the Unfunded Mandates Reform Act; and did those
analyses comply with OMB guidelines?
FY 2004 question Reg. 2: Did the program prepare, where appropriate, a
Regulatory Impact Analysis (RIA) that comports with OMB's economic analysis
guidelines and have these RIA analyses and supporting science and economic
data been subjected to external peer review, as appropriate, by qualified
specialists?
Comment: Minor change.

FY 2005 question 3.RG3: Does the program systematically review its current
regulations to ensure consistency among all regulations in accomplishing
program goals?
FY 2004 question Reg. 3: Same.

FY 2005 question: (none)
FY 2004 question Reg. 4: In developing new regulations, are incremental
societal costs and benefits compared?
Comment: Merged into new 3.RG4.

3.RG4 

Are the regulations designed to 
achieve program goals, to the extent 
practtoabte, by maximizing the net 
benefits of its regulatory activity? 

Reg. 5 

Did tite regulatory changes to the 
program maximize net benefits? 

Combines former questions 

Reg. 4, 5, & 6. 



Reg. 6 

Does the program impose the least 
burden, to the extent practicable, on 
regulated entities, taking into account 
the costs of cumulative final 
regulations? 

Merged in to new 3.RG4. 

Capital Assets and Service Acquisition Programs 

3.CA1 

Is the program managed by 
maintaining clearly defined 
deliveries, capabitity/performance 
characteristics, and appropriate, 
credible cost and schedule goals? 



New question, covers old Cap. 
1, 2, 3, and 4. 



Cap. 1 

Does the program clearly define the 
required quality, capability, and 
performance objectives for 
deliveries and required 
capabilities/performance 
characteristics? 

Merged into new 2.CA1 and 
3.CA1. 



Cap 2. 

Has the program established 
appropriate, credible, cost and 
schedule goals? 

Merged into new 2.CA1 and 
3.CA1. 



Cap 3. 

Has the program conducted a recent, 
credible, cost-benefit analysis that 
shows a net benefit? 

Merged into new 2.CA1 and 
3.CA1. 



Cap 4. 

Does the program have a 
comprehensive strategy for risk 
management that appropriately shares 
risk between the government and 
contractor? 

Merged into new 2.CA1 and 
3.CA1. 

Credit Programs 

3.CR1 

Is the program managed on an 
ongoing basis to assure credit quality 
remains sound, collections and 
disbursements are timely, and 
reporting requirements are fulfilled? 

Cr. 1 

Same. 




Cr.2 

Does the program consistently meet 
the requirements of the Federal Credit 
Reform Act of 1990, the Debt 
Collection Improvement Act and 
applicable guidance under OMB 
Circulars A-1, A-11, and A-129? 

Merged into new 3.CR2. 

3.CR2 

Do the program's credit models 
adequately provide reliable, 
consistent, accurate and transparent 
estimates of costs and the risk to the 
Government? 

Cr.3 

Is the risk of the program to the U.S. 
Government measured effectively? 

Combines former Cr. 2 and 3. 

Research and Development Programs 

R&D programs addressing technology 
development or the construction or 
operation of a facility should answer 
3.CA1. R&D programs that use 
competitive grants should answer 
3.CO1, CO2, and CO3. 

3.RD1 

For R&D programs other than 
competitive grants programs, does the 
program allocate funds and use 
management processes that maintain 
program quality? 

RD. 1 

Does the program allocate funds 
through a competitive, merit-based 
process, or, if not, does it justify 
funding methods and document how 
quality is maintained? 

Modified. 



RD.2 

Does competition encourage the 
participation of new/first-time 
performers through a fair and open 
application process? 

Covered by 3.CO1. 



RD.3 

Does the program adequately define 
appropriate termination points and 
other decision points? 

Covered by 2.CA1 and 3.CA1. 



RD.4 

If the program includes technology 
development or construction or 
operation of a facility, does the 
program clearly define deliverables, 
capability/performance characteristics, 
and appropriate, credible cost and 
schedule goals? 

Covered by 2.CA1 and 3.CA1. 

IV. Program results 

4.1 

Has the program demonstrated 
adequate progress in achieving its 
long-term performance goals? 

1 

Has the program demonstrated 
adequate progress in achieving its 
long-term outcome goal(s)? 

Minor change. 

4.2 

Does the program (including program 
partners) achieve its annual 
performance goals? 

2 

Same. 


4.3 

Does the program demonstrate 
improved efficiencies or cost 
effectiveness in achieving program 
goals each year? 

3 

Same. 


4.4 

Does the performance of this program 
compare favorably to other programs, 
including government, private, etc., 
with similar purpose and goals? 

4 

Does the performance of this program 
compare favorably to other programs 
with similar purpose and goals? 

Minor change. 

4.5 

Do independent evaluations of 
sufficient scope and quality indicate 
that the program is effective and 
achieving results? 

5 

Same. 


Specific Results Questions by Program Type 

Regulatory Based Programs 

4.RG1 

Were programmatic goals (and 
benefits) achieved at the least 
incremental societal cost and did the 
program maximize net benefits? 

Same. 


Capital Assets and Service Acquisition Programs 

4.CA1 

Were program goals achieved within 
budgeted costs and established 
schedules? 

Cap. 1 

Same. 


Research and Development Programs 


R&D programs addressing technology 
development or the construction or 
operation of a facility should answer 
4.CA1. 

RD. 1 

If the program includes construction of 
a facility, were program goals achieved 
within budgeted costs and established 
schedules? 

Simplified. 


Source: OMB Web site, [URL illegible] (downloaded Apr. 7, 2003). 
Appendix III 

Development of PART 


Fiscal Year 2003 

The administration’s efforts to link budget and performance began with the 
fiscal year 2003 budget, in which the administration announced the 
“Executive Branch Management Scorecard,” a traffic-light grading system 
to report the work of federal agencies in implementing the President’s 
Management Agenda’s five governmentwide initiatives. Each quarter, OMB 
assessed agencies’ achievement toward the “standards of success,” 
specific goals articulated for each of the five initiatives. Since some of the 
five initiatives require continual efforts, OMB also assessed agencies’ 
progress toward achieving the standards. The fiscal year 2003 President’s 
Budget also included OMB’s assessments of the effectiveness of 130 
programs and a brief explanation of the assessments. According to OMB, 
the assessments were based on OMB staff’s knowledge of the programs 
and professional judgments; specific criteria were not publicly available 
with which to support OMB’s judgments. 


Fiscal Year 2004 

During the spring of 2002, an internal OMB task force (PET), consisting of 
staff from various OMB divisions, created PART to make the process of 
rating programs robust and consistent across government programs. 
During the development of PART, OMB solicited input from interested 
parties both inside and outside the federal government, including GAO and 
congressional staff. PART was tested on 67 programs during a series of 
Spring Review meetings with the OMB Director. Based on these results and 
other stakeholder feedback, PET recommended a series of refinements to 
PART, such as using a four-point scale in the Results section as opposed to 
the “yes/no” format. Another key change was revising the Program Purpose 
and Design section (Section I) to remove the question “Is the federal role 
critical?” because it was seen as subjective, that is, based on an individual’s 
political views. 

In July 2002, OMB issued PART in final form, with accompanying instructions for 
completing the assessments for the President’s fiscal year 2004 budget 
submission. Later that month, OMB provided a series of training sessions 
on PART for staff from OMB and agencies. Agencies received completed 
PART assessments during early September 2002 and submitted written 
appeals to OMB by mid-September. OMB formed the IRP, comprising OMB 
and agency officials, to conduct consistency reviews¹ and provide 
recommendations on selected PART appeals. The IRP also provided OMB 
with a broad set of recommendations aimed at improving the PART based 
on IRP’s experience with the consistency audit and appeals. OMB was to 
finalize all PART assessments by the end of September 2002, although both 
agency and OMB officials told us that changes and appeals continued 
through the end of the budget season. RMOs within OMB provided draft 
summaries of PART results to the Director of OMB during the Director’s 
review of agencies’ budget requests. The President’s fiscal year 2004 
budget (issued February 3, 2003) included a separate volume containing 
one-page summaries of the PART results for each of the 234 programs that 
were assessed.² 

The relationship between PART and the administration’s proposals was 
presented in agencies’ budget justification materials sent to Congress. In an 
unprecedented move, OMB also posted PART, one-page rating results, and 
detailed supporting worksheets on its Web site. OMB also included its Web 
address in the Performance and Management Assessments volume of the 
budget and, in the budget itself, described PART and its process and 
asked for comments on how to improve PART. 

Figure 3 depicts a time line of the events related to the formulation of the 
President’s budget request, including the key stages of PART development. 


¹ According to OMB, IRP performed consistency reviews on a stratified random sample of 
programs that completed the PART in preparation for the fiscal year 2004 budget. While IRP 
made recommendations regarding its findings, it did not have the authority to enforce them. 

² Fiscal Year 2004 Budget of the United States Government, Performance and 
Management Assessments (Washington, D.C.: February 2003). 

Figure 3: The PART Process and Budget Formulation Timelines 



Fiscal Year 2005 

For the fiscal year 2005 PART, OMB moved the entire assessment process 
from the fall to spring. OMB told us that the change was meant to help 
alleviate the burden of having the PART process overlap the end of the 
budget season, when workload is already so heavy. Another difference 
between the 2 years was that agency officials reported that OMB was more 
collaborative with the agencies in selecting the programs for the fiscal year 
2005 PART. 

Training on the PART assessments to be included in the President’s fiscal 
year 2005 budget began in early May 2003. Agencies submitted PART 
appeals in early July, and OMB aimed to resolve the appeals and finalize the 
PART scores by the end of July. In December of 2003, RMOs were to 
finalize the summaries of PART results, which will be published in 
February along with the fiscal year 2005 President’s Budget. 
Appendix IV 

Comments from the Office of Management 
and Budget 


EXECUTIVE OFFICE OF THE PRESIDENT 
OFFICE OF MANAGEMENT AND BUDGET 

WASHINGTON, DC 20503 


January 16, 2004 


Managing Director 
Federal Budget and Intergovernmental Relations 
General Accounting Office 
441 G Street, NW 
Washington, DC 20548 

Dear Mr. Posner: 

Thank you for the opportunity to comment on the draft GAO report on the PART 
(Performance Budgeting: Observations on the Use of OMB’s Program Assessment Rating Tool 
for the Fiscal Year 2004 Budget, GAO-04-174). 

We appreciate GAO’s extensive review of the PART process. We are particularly 
pleased that your report recognizes the unprecedented transparency of the PART process and 
materials that we have posted on our website¹ and the extensive efforts OMB has taken to make 
the PART process consistent across the government. We will continually strive to make the 
PART as credible, objective, and useful as it can be and believe that your recommendations will 
help us do that. As you know, OMB is already taking actions to address many of them. For 
instance: 

• With respect to centrally monitoring PART recommendations, we have provided 
a simple format for agencies to follow when reporting the status of 
recommendation implementation to OMB and I receive these reports 
semi-annually. We will continue to refine this process so that sufficient attention is 
given to recommendation follow-up. 

• As the PART relies on separate evaluations of evidence of a program’s success, 
we agree that the judgment about what constitutes a sufficient evaluation should 
be based on the quality of the evaluation. 

• Except for programs of insignificant size or impact, we are committed to 
assessing 100 percent of programs using the PART. We are sensitive to the 
impact this will have on OMB staff workload and will manage it accordingly. 

• One of the greatest opportunities for the PART is to compare the performance of, 
and share best practices among, like programs across government. We will 
continue to use the PART for that purpose. 

• We are working diligently to generate the meaningful dialogue with Congress you 
describe in your recommendations. 


¹ See draft report, Appendix III, p. 2. 



DEPUTY DIRECTOR 
FOR MANAGEMENT 


Mr. Paul Posner 

• We will continue to improve agency and Executive Branch implementation of 
GPRA by insisting GPRA plans and reports meet the requirements of this 
important law and the high standards set by the PART. 

Your report makes valuable conclusions and recommendations about the PART and our 
overall effort to create a more results-oriented government. I want to note that the PART was 
designed for and is used in many ways other than just budget formulation. Performance 
information gleaned from the PART process has not only informed budget decisions, but has 
also helped direct management, identified opportunities to improve program design, and 
promoted accountability. We believe that the PART will also greatly improve the goals and 
measures adopted through the GPRA strategic and performance planning processes. 

Thank you for the opportunity to review and comment on your draft report. I appreciate 
your willingness to take our oral and written comments into consideration in the final draft. I 
look forward to working with you to improve the ways in which we are creating a 
results-oriented government. 


Sincerely, 
Appendix V 

GAO Contacts and Staff Acknowledgments 